Compilation Pipeline

Overview

The prompd compilation system uses a two-stage approach that preserves as much context as possible for LLMs while keeping executable files out of packages:

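1. Package-Time Conversion: Potentially executable files and binary documents are converted to safe text, normally by appending .txt to the original name (the .ext.txt form)
2. Compile-Time Integration: The original format is detected from the filename and the content is wrapped in a markdown code block tagged with the matching language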

With this strategy, LLMs still see the full source code, while packages contain no files with executable extensions.

Package-Time File Conversion

Security Through Renaming

When a package is created, potentially dangerous files and binary documents are converted automatically:

# Original files in source directory:
src/
├── auth.js              # JavaScript file
├── styles.css           # CSS stylesheet
├── config.html          # HTML template
├── report.pdf           # PDF document
└── analysis.cpp         # C++ source

# After package creation (.pdpkg contents):
src/
├── auth.js.txt          # Safe text version
├── styles.css.txt       # Safe text version
├── config.html.txt      # Safe text version
├── report.pdf.txt       # Extracted text content
└── analysis.cpp.txt     # Safe text version
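
The renaming step above can be sketched in a few lines. This is a minimal illustration, not prompd's implementation: the function name `packaged_name` and the `KEEP_AS_IS` allow-list are assumptions made for the example, and only the filename is handled here (extracting text from a PDF is a separate step).

```python
from pathlib import PurePosixPath

# Hypothetical allow-list: extensions assumed to be stored unchanged.
# The actual list used by prompd is not given in this document.
KEEP_AS_IS = {".txt", ".md", ".prmd", ".yaml", ".png", ".jpg"}


def packaged_name(path: str) -> str:
    """Return the name a source file would get inside the package."""
    p = PurePosixPath(path)
    if p.suffix.lower() in KEEP_AS_IS:
        return str(p)
    # Append .txt instead of replacing the extension, so the original
    # format can still be read back from the filename at compile time.
    return str(p.with_name(p.name + ".txt"))


print(packaged_name("src/auth.js"))
print(packaged_name("README.md"))
```

Appending rather than replacing the extension is what lets the later compile step recover the original format from the name alone.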

Binary File Processing

Document Extraction

document.pdf       → document.pdf.txt          # Text extracted from PDF
presentation.pptx  → presentation.pptx.txt     # Text extracted from slides
spreadsheet.xlsx   → spreadsheet-sheet1.csv     # First sheet as CSV
                   → spreadsheet-sheet2.csv     # Second sheet as CSV
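
The per-sheet naming shown for spreadsheets could be produced along the following lines. This is a sketch under stated assumptions: the sheets are passed in as in-memory rows (reading the .xlsx file itself is out of scope), and the helper name `sheets_to_csv` is hypothetical.

```python
import csv
import io
from pathlib import PurePosixPath


def sheets_to_csv(workbook_name: str, sheets: list[list[list[str]]]) -> dict[str, str]:
    """Map each sheet (a list of rows) to a '<stem>-sheetN.csv' entry."""
    stem = PurePosixPath(workbook_name).stem
    converted = {}
    for index, rows in enumerate(sheets, start=1):
        buffer = io.StringIO()
        csv.writer(buffer, lineterminator="\n").writerows(rows)
        converted[f"{stem}-sheet{index}.csv"] = buffer.getvalue()
    return converted


result = sheets_to_csv("spreadsheet.xlsx", [[["a", "b"], ["1", "2"]], [["x"]]])
print(list(result))
```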

Image Files

# Images kept as-is (safe for documentation)
diagram.png → diagram.png    # No conversion needed
logo.jpg    → logo.jpg       # No conversion needed

Filename Conflict Resolution

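If a converted name would collide with a file that already exists in the source, the original file is kept as-is and the converted file receives a disambiguated name: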

# Source has both:
report.txt     # Original text file
report.pdf     # PDF document

# Package creates:
report.txt       # Original kept as-is (already safe)
report-pdf.txt   # PDF converted with disambiguated name
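
One way to express this rule in code is shown below. It is only a sketch: the function `disambiguate` is hypothetical, it assumes the converter's proposed output name is supplied by the caller, and it does not cover the case where the fallback name is itself already taken, which this document does not describe.

```python
from pathlib import PurePosixPath


def disambiguate(proposed: str, original: str, taken: set[str]) -> str:
    """Return `proposed`, or '<stem>-<ext>.txt' if `proposed` is already used."""
    if proposed not in taken:
        return proposed
    source = PurePosixPath(original)
    extension = source.suffix.lstrip(".")
    # Fold the original extension into the stem, as in report-pdf.txt.
    return str(source.with_name(f"{source.stem}-{extension}.txt"))


existing = {"report.txt"}
print(disambiguate("report.txt", "report.pdf", existing))
```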

Compile-Time Context Integration

Automatic Format Detection

During compilation, the system detects original formats from filenames and wraps content in appropriate markdown code blocks:

| Filename Pattern | Detected Format | Code Block Language |
|------------------|-----------------|---------------------|
| `*.js.txt` | JavaScript | `javascript` |
| `*.ts.txt` | TypeScript | `typescript` |
| `*.py.txt` | Python | `python` |
| `*.go.txt` | Go | `go` |
| `*.cs.txt` | C# | `csharp` |
| `*.cpp.txt` | C++ | `cpp` |
| `*.java.txt` | Java | `java` |
| `*.rb.txt` | Ruby | `ruby` |
| `*.rs.txt` | Rust | `rust` |
| `*.html.txt` | HTML | `html` |
| `*.css.txt` | CSS | `css` |
| `*.json.txt` | JSON | `json` |
| `*.yaml.txt` | YAML | `yaml` |
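
The lookup described by the table above can be sketched as a dictionary keyed on the original extension. The mapping is copied from the table; the function name `detect_language` and the choice to return `None` for unrecognised names are assumptions for illustration.

```python
# Mapping copied from the table above: original extension -> fence language.
FENCE_LANGUAGE = {
    "js": "javascript", "ts": "typescript", "py": "python", "go": "go",
    "cs": "csharp", "cpp": "cpp", "java": "java", "rb": "ruby", "rs": "rust",
    "html": "html", "css": "css", "json": "json", "yaml": "yaml",
}


def detect_language(filename: str):
    """Return the fence language for '<name>.<ext>.txt', else None."""
    parts = filename.lower().rsplit(".", 2)
    if len(parts) == 3 and parts[2] == "txt":
        return FENCE_LANGUAGE.get(parts[1])
    return None


print(detect_language("auth.js.txt"))
print(detect_language("report.txt"))
```

A plain `report.txt` has no embedded original extension, so it falls through to `None` and would be treated as ordinary text.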

Code Block Wrapping Example

A file named auth.js.txt containing:

function authenticate(user, password) {
    return jwt.sign({ id: user.id }, secret);
}

is compiled into a fenced code block tagged javascript, so the LLM receives the complete source along with its language.
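
A wrapping step of this kind might look like the sketch below. The function name `wrap_in_code_block` is hypothetical, and lengthening the fence when the content already contains backtick runs is a precaution added for the example; whether prompd does the same is not stated in this document.

```python
import re


def wrap_in_code_block(content: str, language: str) -> str:
    """Wrap `content` in a fenced code block tagged with `language`."""
    # Use a fence longer than any backtick run in the content, so embedded
    # fences cannot close the block early.
    longest_run = max((len(run) for run in re.findall(r"`+", content)), default=0)
    fence = "`" * max(3, longest_run + 1)
    body = content if content.endswith("\n") else content + "\n"
    return f"{fence}{language}\n{body}{fence}\n"


source = (
    "function authenticate(user, password) {\n"
    "    return jwt.sign({ id: user.id }, secret);\n"
    "}\n"
)
print(wrap_in_code_block(source, "javascript"))
```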

Security Through Conversion

Before Conversion (Unsafe)

malicious-package.pdpkg/
├── legitimate.prmd      # Safe prompt file
├── exploit.js           # Executable JavaScript
├── backdoor.php         # Executable PHP
└── virus.exe            # Executable binary

After Conversion (Safe)

malicious-package.pdpkg/
├── legitimate.prmd      # Safe prompt file
├── exploit.js.txt       # Safe text - cannot execute
├── backdoor.php.txt     # Safe text - cannot execute
└── virus.exe.txt        # Safe text - cannot execute

LLMs still receive the full context through code blocks, but no file with an executable extension remains inside the package.
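
The before-and-after listings suggest a simple check that could be run over a package's entries. The sketch below assumes a hypothetical deny-list `EXECUTABLE_EXTENSIONS` and a hypothetical helper `find_unconverted`; prompd's actual validation rules are not listed in this document.

```python
from pathlib import PurePosixPath

# Hypothetical deny-list for illustration only.
EXECUTABLE_EXTENSIONS = {".js", ".php", ".exe", ".py", ".sh", ".bat"}


def find_unconverted(entries: list[str]) -> list[str]:
    """Return package entries whose final extension is still executable."""
    return [
        entry for entry in entries
        if PurePosixPath(entry).suffix.lower() in EXECUTABLE_EXTENSIONS
    ]


before = ["legitimate.prmd", "exploit.js", "backdoor.php", "virus.exe"]
after = ["legitimate.prmd", "exploit.js.txt", "backdoor.php.txt", "virus.exe.txt"]
print(find_unconverted(before))
print(find_unconverted(after))
```

Only the final extension is inspected, which is why `exploit.js.txt` passes while `exploit.js` is flagged.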

Multi-Language Project Example

Source Directory

web-app/
├── frontend/
│   ├── app.js
│   ├── styles.css
│   └── index.html
├── backend/
│   ├── server.py
│   └── config.yaml
├── docs/
│   ├── api.md
│   └── architecture.pdf
└── README.md

Package Contents After Conversion

web-app.pdpkg/
├── frontend/
│   ├── app.js.txt       # Converted JavaScript
│   ├── styles.css.txt   # Converted CSS
│   └── index.html.txt   # Converted HTML
├── backend/
│   ├── server.py.txt    # Converted Python
│   └── config.yaml      # Safe (kept as-is)
├── docs/
│   ├── api.md           # Safe (kept as-is)
│   └── architecture.pdf.txt  # Extracted text
├── README.md            # Safe (kept as-is)
└── manifest.json        # Package metadata

During compilation, each converted file is automatically wrapped in the correct code block format, giving LLMs complete context about the codebase structure and implementation details.
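
Putting the pieces together for a package like the one above, the compile step could be sketched as follows. Everything here is illustrative: `compile_context` is a hypothetical name, the `File:` header line is an invented layout, and `LANGUAGES` is only a subset of the format table sufficient for this example.

```python
from pathlib import PurePosixPath

# Subset of the extension-to-language table, enough for this example.
LANGUAGES = {"js": "javascript", "css": "css", "html": "html", "py": "python"}
FENCE = "`" * 3


def compile_context(files: dict[str, str]) -> str:
    """Concatenate package files, fencing those named '<name>.<ext>.txt'."""
    sections = []
    for path in sorted(files):
        suffixes = PurePosixPath(path).suffixes  # e.g. ['.js', '.txt']
        language = None
        if len(suffixes) >= 2 and suffixes[-1] == ".txt":
            language = LANGUAGES.get(suffixes[-2].lstrip("."))
        text = files[path].rstrip("\n")
        if language:
            text = f"{FENCE}{language}\n{text}\n{FENCE}"
        sections.append(f"File: {path}\n{text}")
    return "\n\n".join(sections) + "\n"


package = {
    "frontend/app.js.txt": "console.log(1);\n",
    "README.md": "# Web App\n",
}
print(compile_context(package))
```

Files without a recognised embedded extension, such as `README.md` or `architecture.pdf.txt`, pass through as plain text in this sketch.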