Abstract
Publicly traded companies in the U.S. submit annual 10-K filings to the SEC's EDGAR system, providing comprehensive details about their financial condition, business strategies, and corporate actions such as mergers and acquisitions (M&A). Traditional methods for analyzing 10-K filings are manual and time-consuming, which limits scalability in corporate due diligence and financial research. This paper introduces an automated approach that combines Legal-BERT for entity extraction, semantic filtering with GloVe embeddings, and contextual filtering with sentence embeddings to identify key entities and provisions related to M&A activities. Our hybrid method raises the F1-score to as much as 87.8%, compared with 78.2% for Legal-BERT alone, by reducing false positives while maintaining high recall. The approach offers a scalable solution for financial analysis, compliance monitoring, risk assessment, and competitive analysis.
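As a rough illustration of the two filtering stages described above, the following sketch keeps a candidate entity only if it passes both a word-level and a sentence-level cosine-similarity check. It is a minimal sketch under stated assumptions, not the implementation evaluated in this paper: the three-dimensional vectors are hand-made stand-ins for GloVe and sentence embeddings, and the function names, reference vectors, example strings, and the 0.5 thresholds are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def filter_candidates(candidates, word_ref, sent_ref,
                      word_thresh=0.5, sent_thresh=0.5):
    """Keep candidates that pass both the semantic and the contextual check.

    Thresholds are illustrative assumptions, not values from the paper.
    """
    kept = []
    for cand in candidates:
        # Stage 1: semantic filter (stand-in for GloVe word vectors).
        if cosine(cand["word_vec"], word_ref) < word_thresh:
            continue
        # Stage 2: contextual filter (stand-in for sentence embeddings).
        if cosine(cand["sent_vec"], sent_ref) < sent_thresh:
            continue
        kept.append(cand["text"])
    return kept


# Hypothetical reference vectors representing "M&A-related" meaning.
word_ref = [1.0, 0.0, 0.0]
sent_ref = [0.0, 1.0, 0.0]

# Toy candidates, as if produced by an upstream entity extractor.
candidates = [
    {"text": "Acme Corp acquisition",
     "word_vec": [0.9, 0.1, 0.0], "sent_vec": [0.1, 0.9, 0.0]},
    {"text": "Acme Corp cafeteria",
     "word_vec": [0.8, 0.2, 0.0], "sent_vec": [0.9, 0.1, 0.0]},
    {"text": "fiscal year",
     "word_vec": [0.0, 0.1, 0.9], "sent_vec": [0.0, 0.9, 0.1]},
]

kept = filter_candidates(candidates, word_ref, sent_ref)
print(kept)
```

In this toy setup the second candidate passes the word-level check but is dropped by the sentence-level check, and the third is dropped at the word level. This mirrors how stacking the two filters is intended to remove false positives from the extractor's output.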