Compositional Reasoning over Structured and Unstructured Data Using Hybrid Indexing Frameworks

Authors

  • Qingyuan Zhou, Department of Computer Science, University of Pittsburgh, USA

DOI:

https://doi.org/10.71465/mrcis159

Keywords:

Compositional reasoning, hybrid indexing, structured data, unstructured data, semantic retrieval

Abstract

The exponential growth of heterogeneous data sources has created unprecedented challenges for information retrieval and knowledge extraction systems. Modern enterprises and research institutions routinely manage vast repositories containing both structured databases and unstructured text collections, yet traditional indexing approaches remain siloed in their treatment of these distinct data modalities. This research investigates compositional reasoning mechanisms that enable unified query processing across structured and unstructured data through hybrid indexing frameworks. We propose a novel architecture that integrates semantic embeddings with relational schema representations, employing gating mechanisms to dynamically balance contributions from both modalities. Our methodology combines graph-based knowledge structures with dense vector retrieval systems, implementing attention mechanisms and modular reasoning components that enable flexible query decomposition and execution. Through extensive experiments on enterprise datasets containing financial records, technical documentation, and operational logs, we demonstrate that hybrid indexing frameworks achieve superior performance in multi-hop reasoning tasks compared to single-modality approaches. The proposed system reduces query response time by 34% while improving answer accuracy by 28% on compositional queries requiring integration across database tables and document collections. These findings suggest that unified indexing strategies with compositional reasoning represent a critical enabler for next-generation question answering systems, business intelligence platforms, and knowledge management applications operating in complex data environments.
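
The gating idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the form of the gate, so everything here is an assumption made for illustration: a single scalar sigmoid gate, per-candidate relevance scores already normalized to [0, 1], and the hypothetical names `gated_score` and `rank`. It is not the authors' implementation.

```python
import math


def sigmoid(x: float) -> float:
    """Standard logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_score(semantic: float, structured: float, gate_logit: float) -> float:
    """Blend two relevance scores with a scalar gate (assumed form).

    semantic:   score from dense vector retrieval over unstructured text
    structured: score from matching against relational schema / tables
    gate_logit: raw gate value; in a trained system this would presumably
                be predicted from the query, here it is passed in directly
    """
    g = sigmoid(gate_logit)
    return g * semantic + (1.0 - g) * structured


def rank(candidates, gate_logit: float):
    """Rank (id, semantic_score, structured_score) tuples by gated score."""
    scored = [
        (cid, gated_score(sem, struct, gate_logit))
        for cid, sem, struct in candidates
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical candidates: one document passage and one table row.
    candidates = [("doc", 0.9, 0.1), ("row", 0.2, 0.8)]
    print(rank(candidates, gate_logit=2.0))   # gate leans toward semantic scores
    print(rank(candidates, gate_logit=-2.0))  # gate leans toward structured scores
```

With a positive gate value the document passage ranks first, and with a negative one the table row does, which is the sense in which a gate can balance the two modalities per query.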

Published

2025-12-25