SIGIR-KDF 2023 : The 4th Workshop on Knowledge Discovery from Unstructured Data in Financial Services
Call For Papers
Artificial intelligence (AI) and information retrieval (IR) systems and techniques have been widely adopted in financial services to tackle various tasks, such as information retrieval from business documents, retrieval from non-textual content like tables and graphs, recommending financial products and services to customers, providing decision support for investment practices, automating due diligence protocols, detecting fraudulent transactions, financial sentiment analysis on social media, and understanding Environmental, Social and Governance (ESG) impact on investment practices.
Knowledge from IR systems can help augment human intelligence. However, discovering and extracting the knowledge conveyed inside unstructured financial data, such as SEC filings, prospectuses, business reports, and other enterprise documents, is extremely challenging due to the massive volume of data, large variation in data formats, low signal-to-noise ratio, scarcity of expert-annotated datasets, task ambiguity, hurdles regarding data integrity and privacy, the need for robustness against domain shift, and high-performance requirements set by industry and regulatory standards. Manual extraction of knowledge is usually inefficient, error-prone, and inconsistent, making it one of the key technical bottlenecks that keep financial services companies from improving their operating productivity. These challenges call for robust artificial intelligence, information retrieval, and machine learning algorithms and systems. The automated processing of unstructured data to discover knowledge from complex financial documents requires bringing together a suite of techniques such as natural language processing, information retrieval, semantic analysis, and complex reasoning. In addition, how knowledge is captured and represented, synthesized across diverse sources, and used within AI systems is crucial to developing effective solutions in financial services.
Furthermore, based on the reflections and feedback from our past KDF workshops, the 2023 workshop is particularly interested in multi-modal understanding of financial documents, retrieving and reasoning over tabular data within financial documents, and financial domain-specific representation learning. The workshop will be composed of three components: invited talks, paper presentations, and a shared task competition. We cordially welcome researchers, practitioners, and students from academic and industrial communities who are interested in these topics to participate and/or submit their original work.
The scope of the workshop includes, but is not limited to, the following areas:
AI and IR technologies for business document understanding for financial corporations, including searching and question answering systems, understanding and reasoning over non-textual content such as tables and graphs;
representation learning, and distributed representation learning and encoding in natural language processing for financial documents;
language modeling on financial corpora including tabular and numerical data, and multi-modal modeling;
multi-source knowledge integration and fusion, and knowledge alignment and integration from heterogeneous data;
reconciling unstructured knowledge with structured knowledge and human expertise;
named-entity disambiguation, recognition, resolution, relationship discovery, ontology learning and extraction in financial and business documents;
AI-assisted domain data tagging, labeling, and annotation for IR tasks; automatic data extraction from financial filings and quality verification;
corporate ESG event discovery, evaluation, and impact assessment;
event discovery from alternative data and impact on corporate equity pricing;
AI and IR systems for financial risk assessment on financial legal documents such as contracts and prospectuses;
verifying facts and statements generated by large pre-trained language models using IR and knowledge discovery;
IR or QA techniques and applications on financial documents leveraging large language models.
Although textual data is prevalent in a large number of finance-related business problems, we also encourage submissions of studies or applications pertinent to finance that use other types of unstructured data, such as data from financial transactions, sensors, mobile devices, satellites, and social media.
We invite submissions of relevant work that would be of interest to the workshop. All submissions must be original contributions that have not been previously published and that are not currently under review by other conferences or journals. Submissions will undergo single-blind peer review. Submissions will be assessed based on their novelty, technical quality, significance of impact, interest, clarity, relevance, and reproducibility. All submissions must be in PDF format and follow the current ACM two-column conference format. We accept two types of submissions:
full research paper: no longer than 9 pages (including references, proofs, and appendixes).
short/poster paper: no longer than 4 pages (including references, proofs, and appendixes).
Submissions will be accepted via Microsoft CMT. All accepted papers will be presented at the workshop. Submissions will be non-archival, and authors may post their work on arXiv or other online repositories.