Supported Formats
Finch Fusion supports a variety of document formats for upload and analysis. This guide covers supported file types and their characteristics.
Fully Supported Formats
PDF Documents (.pdf)
PDF is the most common format for business and financial documents. Supported features:
- Native text extraction
- OCR for scanned documents
- Multi-page documents
- Embedded images and tables
Typical use cases:
- Annual and quarterly reports (10-K, 10-Q filings)
- Research papers
- Presentations converted to PDF
- Scanned documents
Word Documents (.docx)
Microsoft Word documents are fully supported. Supported features:
- Full text extraction
- Formatting preservation for analysis
- Tables and lists
- Embedded images (text from images may not be extracted)
Typical use cases:
- Draft reports
- Internal memos
- Written analyses
- Meeting notes
Older .doc format may have limited support. Convert to .docx for best results.
Plain Text Files (.txt)
Simple text files are processed quickly and accurately. Supported features:
- Full text extraction
- Fast processing
- Universal compatibility
Typical use cases:
- Data exports
- Transcripts
- Simple notes
- Log files
Format Comparison
| Format | Text Accuracy | Processing Speed | Best Use Case |
|---|---|---|---|
| PDF (text-based) | Excellent | Fast | Official reports, filings |
| PDF (scanned) | Good | Slower | Physical document digitization |
| DOCX | Excellent | Fast | Draft documents, notes |
| TXT | Perfect | Very Fast | Plain text content |
File Size Considerations
Recommended File Sizes
| File Type | Recommended Max | Notes |
|---|---|---|
| PDF | 50 MB | Larger files take longer to process |
| DOCX | 25 MB | Images increase file size significantly |
| TXT | 10 MB | Very large text files may time out |
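As a rough illustration, the recommended sizes in the table could be checked locally before uploading. This is a minimal sketch, not part of Finch Fusion: the function name `exceeds_recommended_size` is hypothetical, and it assumes 1 MB means 1024 * 1024 bytes, which this guide does not specify.

```python
from pathlib import Path

# Recommended maximum sizes from the table above, in megabytes.
RECOMMENDED_MAX_MB = {".pdf": 50, ".docx": 25, ".txt": 10}


def exceeds_recommended_size(path: str) -> bool:
    """Return True if the file is larger than the recommended maximum for its type."""
    file = Path(path)
    limit_mb = RECOMMENDED_MAX_MB.get(file.suffix.lower())
    if limit_mb is None:
        raise ValueError(f"No recommended size listed for {file.suffix!r}")
    # Assumption: 1 MB is treated as 1024 * 1024 bytes.
    return file.stat().st_size > limit_mb * 1024 * 1024
```

A file that exceeds its limit may still upload; the limits are recommendations, so a check like this is best used as a warning rather than a hard stop.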
Handling Large Documents
If your document exceeds recommended sizes:
- Split the document - Break it into logical sections
- Compress images - Reduce embedded image sizes
- Remove unnecessary pages - Include only relevant content
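For plain text files, splitting can be scripted. The sketch below is one possible approach under stated assumptions: the helper name `split_text_file` and the `_partN` naming scheme are hypothetical, the file is assumed to be UTF-8 (or another encoding where a newline is a single `\n` byte), and line boundaries stand in for "logical sections", which usually still need a manual review.

```python
from pathlib import Path


def split_text_file(path: str, max_bytes: int) -> list[Path]:
    """Split a text file into numbered parts, breaking only at line ends.

    Each part stays within max_bytes unless a single line is longer than
    max_bytes, in which case that line becomes its own oversized part.
    """
    source = Path(path)
    parts: list[Path] = []
    buffer: list[bytes] = []
    size = 0

    def flush() -> None:
        # Write the buffered lines to the next numbered part file.
        nonlocal buffer, size
        if not buffer:
            return
        part = source.with_name(f"{source.stem}_part{len(parts) + 1}{source.suffix}")
        part.write_bytes(b"".join(buffer))
        parts.append(part)
        buffer, size = [], 0

    with source.open("rb") as handle:
        for line in handle:
            # Start a new part when adding this line would exceed the limit.
            if buffer and size + len(line) > max_bytes:
                flush()
            buffer.append(line)
            size += len(line)
    flush()
    return parts
```

For example, `split_text_file("export.txt", 10 * 1024 * 1024)` would write `export_part1.txt`, `export_part2.txt`, and so on next to the original file, leaving the original untouched.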
Optimizing Documents for Upload
PDF Best Practices
- Use text-based PDFs when possible
- Ensure scanned documents are at least 300 DPI
- Flatten complex PDFs before upload
- Remove password protection before uploading
Word Document Best Practices
- Save as .docx format (not .doc)
- Compress images before adding to the document
- Remove track changes and comments
- Ensure fonts are embedded or use standard fonts
General Tips
- Give files descriptive names
- Ensure documents are not corrupted before upload
- Remove any password protection
- Verify the document opens correctly on your computer first
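A quick local check can catch some corrupted or mislabeled files before upload. The sketch below is an illustration only, not how Finch Fusion validates files: the function name `looks_valid` is hypothetical, and the checks are deliberately shallow. It assumes a PDF starts with the `%PDF-` header, a .docx is a ZIP archive containing `word/document.xml`, and a .txt file is UTF-8 encoded.

```python
import zipfile
from pathlib import Path


def looks_valid(path: str) -> bool:
    """Shallow sanity check based on file signatures; not a full validation."""
    file = Path(path)
    suffix = file.suffix.lower()

    if suffix == ".pdf":
        # Strict check: the PDF header is expected at the very start of the file.
        with file.open("rb") as handle:
            return handle.read(5) == b"%PDF-"

    if suffix == ".docx":
        # A .docx file is a ZIP package whose main part is word/document.xml.
        if not zipfile.is_zipfile(file):
            return False
        with zipfile.ZipFile(file) as archive:
            return "word/document.xml" in archive.namelist()

    if suffix == ".txt":
        # Assumption: text files are UTF-8; other encodings will fail this check.
        try:
            file.read_text(encoding="utf-8")
        except UnicodeDecodeError:
            return False
        return True

    # Any other extension is outside the formats this guide lists as supported.
    return False
```

Passing this check does not guarantee the file will process correctly, so opening the document on your computer remains the more reliable test.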
Unsupported Formats
The following formats are not currently supported:
- Spreadsheets (.xlsx, .xls, .csv)
- Presentations (.pptx, .ppt)
- Images (.jpg, .png, .gif)
- Web pages (.html)
- Email files (.eml, .msg)
- Compressed archives (.zip, .rar)
For unsupported formats, consider converting to PDF or copy-pasting content into a text file before upload.
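When preparing many files at once, the lists in this guide can be encoded as a simple lookup. This is a sketch under stated assumptions: `upload_advice` and its return strings are hypothetical, and the extension sets only mirror what this guide lists, so an extension missing from them is reported as unknown rather than unsupported.

```python
from pathlib import Path

# Extensions as listed in this guide.
SUPPORTED = {".pdf", ".docx", ".txt"}
LIMITED = {".doc"}
UNSUPPORTED = {
    ".xlsx": "spreadsheet", ".xls": "spreadsheet", ".csv": "spreadsheet",
    ".pptx": "presentation", ".ppt": "presentation",
    ".jpg": "image", ".png": "image", ".gif": "image",
    ".html": "web page",
    ".eml": "email file", ".msg": "email file",
    ".zip": "compressed archive", ".rar": "compressed archive",
}


def upload_advice(filename: str) -> str:
    """Map a file name to a short recommendation based on its extension."""
    suffix = Path(filename).suffix.lower()
    if suffix in SUPPORTED:
        return "supported"
    if suffix in LIMITED:
        return "limited support: convert to .docx"
    if suffix in UNSUPPORTED:
        kind = UNSUPPORTED[suffix]
        return f"unsupported ({kind}): convert to PDF or paste the content into a .txt file"
    return "unknown: not listed in this guide"
```

The check looks only at the file name, so it cannot tell whether the contents actually match the extension.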
Format Conversion Tips
If you need to convert documents:
Spreadsheets to PDF
- Open in Excel or Google Sheets
- Use Print > Save as PDF
- Upload the resulting PDF
Presentations to PDF
- Open in PowerPoint or Google Slides
- Use File > Export as PDF
- Upload the PDF
Web Pages to PDF
- Open the page in your browser
- Use Print > Save as PDF
- Upload the PDF
Troubleshooting Format Issues
My PDF isn't processing correctly
The PDF may be image-based without embedded text. Finch Fusion uses OCR for scanned documents, but results depend on image quality. Consider rescanning at a higher resolution; at least 300 DPI is recommended.
Word document formatting looks wrong
Formatting is used for analysis but not preserved in display. Focus on the text content being extracted correctly rather than visual formatting.
My file type isn't supported
Convert the file to a supported format (preferably PDF) before uploading. Most applications can export or print to PDF.