Investarget is a large China-based investment bank with more than 1,200 active deals and projects worldwide. To manage this volume more effectively, it would be very useful to automate the evaluation of a potential deal by quantifying aspects such as risk (legal, financial, and otherwise) and potential for success. Investarget has no in-house data science or ML team, and would therefore prefer a fully end-to-end model that takes raw information about a deal, typically a 20-page pitch deck and an executive summary, and outputs a score or classification that is useful in evaluating it. The inputs would include both highly quantitative, structured data, such as the company's financial metrics, and unstructured, softer data such as the business model, product reviews, and management profiles. Whereas the former kind of data is easily obtainable through public APIs, we may have to collect the latter ourselves through web crawling or scraping.
This project will span two semesters because of its open-ended and complex nature. It would heavily involve data cleaning and formatting, NLP, and even computer vision in the initial stages, and it requires a working understanding of finance and investment theory. The problem would be framed as supervised learning, using many examples of past deals that have been labeled as successful or not. Eventually, the model should perform at the level of the firm's existing analysts.
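To make the supervised framing concrete, here is a minimal sketch of how structured metrics and free text could be combined into one feature vector and fed to a classifier that outputs a score between 0 and 1. Everything in it is an assumption made for illustration: the field names (`metrics`, `text`, `label`), the tiny vocabulary, the four toy deals, and the choice of a hand-written logistic regression. None of it is Investarget data or the method the project will settle on.

```python
import math
from collections import Counter

# Hypothetical vocabulary of words to count in the unstructured text.
VOCAB = ["strong", "weak", "lawsuit", "experienced"]

# Made-up toy deals: two structured metrics plus a short free-text summary.
# label 1 = past deal judged successful, 0 = not successful.
deals = [
    {"metrics": [0.4, 0.2], "text": "experienced team strong product", "label": 1},
    {"metrics": [0.3, 0.3], "text": "strong growth loyal customers", "label": 1},
    {"metrics": [-0.1, 0.8], "text": "pending lawsuit weak reviews", "label": 0},
    {"metrics": [0.0, 0.9], "text": "weak product high churn", "label": 0},
]

def featurize(deal, vocab):
    """Concatenate the structured metrics with bag-of-words counts from the text."""
    counts = Counter(deal["text"].lower().split())
    return list(deal["metrics"]) + [float(counts[word]) for word in vocab]

def sigmoid(z):
    # Clamp z so math.exp cannot overflow on extreme inputs.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))

def train(X, y, lr=0.1, epochs=500):
    """Fit logistic regression weights with plain stochastic gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi  # gradient of the log loss with respect to z
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b

def score(deal, vocab, w, b):
    """Return a score in (0, 1); higher means more similar to past successes."""
    x = featurize(deal, vocab)
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

X = [featurize(d, VOCAB) for d in deals]
y = [d["label"] for d in deals]
w, b = train(X, y)
```

Calling `score(new_deal, VOCAB, w, b)` on a dictionary with the same two fields returns the model's score for that deal. A real system would replace the word counts with proper NLP features extracted from the pitch deck and would be evaluated on held-out deals rather than on the training examples.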
No Team Members