Unstructured.io provides open-source components for pre-processing text documents such as PDFs, HTML, and Word documents into structured formats for downstream AI applications.
Technology Stack
Core ML Skills Demonstrated
Document Understanding & OCR: Advanced image processing, hybrid OCR systems (Tesseract + PaddleOCR), layout-aware text extraction.
Computer Vision: Custom layout detection (YOLOX and YOLO-NAS models, DocLayNet dataset), element ordering with the XY-cut algorithm, image block extraction.
Vision-Language Models (VLMs): Integrated VLMs (Claude, GPT-4o, Gemini) for complex layout comprehension and prompt-engineered table extraction.
LLM Systems: Multi-agent LLM orchestration using LangChain and Model Context Protocol (MCP), custom API integration.
ETL Pipeline Design: End-to-end pipelines that turn unstructured documents into structured data.