From idea to architecture: a technical deep dive into building an AI-assisted CV optimization platform using LLM APIs.


Introduction

Modern hiring pipelines are increasingly mediated by Applicant Tracking Systems (ATS), transforming CVs from human-readable narratives into machine-parsed datasets. This shift introduces a structural and semantic challenge: a CV must be both expressive for humans and optimized for automated filtering systems.

YaAICV (Yet Another AI CV) was designed to address this duality. It is not just a CV generator, but a structured optimization engine that leverages Large Language Models (LLMs) to align candidate profiles with job descriptions—while preserving user intent and factual accuracy.

This article presents a technical deep dive into the architecture, design decisions, and engineering trade-offs behind the system.


Problem Framing

At its core, the problem can be modeled as a semantic alignment task between two documents:

  • A structured representation of a candidate’s experience
  • A semi-structured job description

The objective is to maximize alignment across:

  • Terminology
  • Skill representation
  • Contextual relevance

However, three constraints complicate this:

  1. ATS platforms impose strict formatting and parsing rules
  2. Manual CV customization does not scale
  3. LLMs introduce probabilistic and sometimes unreliable outputs

The system must therefore balance determinism, flexibility, and interpretability.


Data Modeling: CV as Structured Data

A key design decision was to treat CVs as structured objects rather than free text.

{
  "experience": [
    {
      "role": "Software Engineer",
      "description": "...",
      "technologies": ["Python", "Docker"]
    }
  ],
  "skills": ["Python", "Machine Learning"],
  "education": [],
  "projects": []
}

This approach enables:

  • Section-level optimization
  • Fine-grained rewriting
  • Deterministic transformations

It also simplifies downstream AI interactions, as prompts can be constructed with clear boundaries and context.
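For instance, a section-scoped prompt can be assembled directly from the structured object. This is a minimal sketch; `build_section_prompt` is an illustrative name, not the project's actual API:

```python
import json

def build_section_prompt(section_name: str, section_data: list, job_excerpt: str) -> str:
    """Construct a prompt bounded to a single CV section."""
    return (
        f"CV section '{section_name}':\n"
        f"{json.dumps(section_data, indent=2)}\n\n"
        f"Target job description excerpt:\n{job_excerpt}\n\n"
        "Rewrite only this section. Do not add new facts."
    )

cv = {
    "experience": [
        {"role": "Software Engineer", "technologies": ["Python", "Docker"]}
    ]
}
prompt = build_section_prompt("experience", cv["experience"], "Backend role using Python")
```

Because the section boundary is explicit, the model never sees (or rewrites) unrelated parts of the CV.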


AI Abstraction Layer

The system defines a generic interface for LLM providers:

class AIProvider:
    """Minimal interface every LLM provider must implement."""
    def analyze(self, prompt: str) -> str:
        raise NotImplementedError

Concrete implementations can target different APIs (OpenAI, Anthropic, local inference engines), allowing:

  • Cost-aware routing
  • Fallback strategies
  • Model experimentation without refactoring core logic

This abstraction becomes essential in production environments where pricing, latency, and rate limits vary significantly across providers.
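A fallback strategy on top of such an interface might look like the following sketch; the concrete provider classes and the `FallbackRouter` name are illustrative assumptions, not the project's actual implementation:

```python
class AIProvider:
    """Minimal interface every LLM provider must implement."""
    def analyze(self, prompt: str) -> str:
        raise NotImplementedError

class PrimaryProvider(AIProvider):
    """Stands in for a hosted API; here it simulates an outage."""
    def analyze(self, prompt: str) -> str:
        raise RuntimeError("rate limited")

class LocalProvider(AIProvider):
    """Stands in for a local inference engine."""
    def analyze(self, prompt: str) -> str:
        return f"[local] {prompt[:40]}"

class FallbackRouter(AIProvider):
    """Try providers in priority order; fall through on failure."""
    def __init__(self, providers):
        self.providers = providers

    def analyze(self, prompt: str) -> str:
        last_error = None
        for provider in self.providers:
            try:
                return provider.analyze(prompt)
            except Exception as exc:
                last_error = exc  # remember why this provider failed
        raise RuntimeError("all providers failed") from last_error

router = FallbackRouter([PrimaryProvider(), LocalProvider()])
result = router.analyze("Summarize this experience entry.")
```

Because the router itself satisfies the `AIProvider` interface, it can be dropped in anywhere a single provider is expected.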


The Optimization Pipeline

The optimization process is not a single LLM call, but a pipeline of transformations combining deterministic logic and probabilistic inference.

Pipeline Flow

[ CV Input ]
      |
      v
[ Normalization ]
      |
      v
[ Job Description Analysis ]
      |
      v
[ Semantic Comparison ]
      |
      v
[ LLM Suggestions ]
      |
      v
[ Human Validation ]

Step 1: Normalization

Structured CV data is converted into a normalized textual representation. This step removes formatting inconsistencies and ensures predictable tokenization.
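A minimal sketch of this step, assuming a flat key-value text layout (the helper name and output format are illustrative):

```python
def normalize_cv(cv: dict) -> str:
    """Flatten structured CV data into a predictable plain-text form."""
    lines = []
    for entry in cv.get("experience", []):
        techs = ", ".join(entry.get("technologies", []))
        lines.append(f"ROLE: {entry['role']} | TECH: {techs}")
    if cv.get("skills"):
        lines.append("SKILLS: " + ", ".join(cv["skills"]))
    return "\n".join(lines)

cv = {
    "experience": [
        {"role": "Software Engineer", "technologies": ["Python", "Docker"]}
    ],
    "skills": ["Python", "Machine Learning"],
}
text = normalize_cv(cv)
```

The fixed field order and delimiters mean the same CV always tokenizes the same way, regardless of how the user originally entered it.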

Step 2: Job Description Analysis

The job description is processed to extract key signals:

  • Required skills
  • Action verbs
  • Domain-specific terminology
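Signal extraction can be as simple as matching against a skill vocabulary, sketched below; the real system may instead rely on an LLM or embeddings, and the vocabulary here is an illustrative assumption:

```python
import re

# Illustrative vocabulary; a production system would load a larger taxonomy.
SKILL_VOCAB = {"python", "docker", "kubernetes", "sql", "machine learning"}

def extract_skills(job_text: str) -> set:
    """Return vocabulary skills mentioned in the job description."""
    text = job_text.lower()
    return {
        skill for skill in SKILL_VOCAB
        if re.search(r"\b" + re.escape(skill) + r"\b", text)
    }

skills = extract_skills("We need Python and Docker experience; SQL is a plus.")
```

Word-boundary matching avoids false positives such as "go" inside "algorithm", a common pitfall of naive substring search.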

Step 3: Semantic Comparison

result = analyzer.compare(
    cv_text=my_cv,
    job_offer=job_description
)

print(result.match_score)
print(result.missing_keywords)

The scoring algorithm combines:

  • Keyword overlap
  • Heuristic weighting
  • Contextual similarity (via LLM or embeddings)
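One plausible way to combine these three signals into a single score; the weights and blend ratio below are illustrative assumptions, not the actual algorithm:

```python
def match_score(cv_keywords: set, job_keywords: set,
                weights: dict, contextual: float = 0.0) -> float:
    """Blend weighted keyword overlap with a contextual similarity term."""
    if not job_keywords:
        return 0.0
    # Weighted overlap: required skills can count more than nice-to-haves.
    hit = sum(weights.get(k, 1.0) for k in job_keywords & cv_keywords)
    total = sum(weights.get(k, 1.0) for k in job_keywords)
    overlap = hit / total
    # Blend with a contextual similarity term (e.g. from embeddings).
    return round(0.7 * overlap + 0.3 * contextual, 3)

score = match_score(
    cv_keywords={"python", "docker"},
    job_keywords={"python", "docker", "sql"},
    weights={"python": 2.0},  # "python" weighted as a must-have
    contextual=0.8,
)
```

Keeping the heuristic part deterministic makes scores reproducible and debuggable, while the contextual term can be swapped out as embedding models improve.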

Step 4: Suggestion Generation

The LLM is used selectively to:

  • Rewrite experience entries
  • Suggest missing skills
  • Improve phrasing

Prompt design is critical here. Prompts enforce:

  • Truthfulness
  • Brevity
  • Structural constraints

Step 5: Human Validation

All generated content is exposed to the user for validation. This ensures that optimization does not compromise authenticity.


Prompt Engineering and Control

LLMs are inherently stochastic. Without constraints, they tend to:

  • Hallucinate experience
  • Over-generalize
  • Introduce irrelevant keywords

To mitigate this, YaAICV applies:

  • Strict prompt templates
  • Output length constraints
  • Post-processing filters

Example:

Rewrite the following experience to better match the job description.
Do not add new facts. Keep it under 3 bullet points.

This hybrid approach—LLM + deterministic validation—ensures reliable outputs.
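A deterministic validation pass for the prompt above might look like this sketch; the bullet limit and the truthfulness heuristic are illustrative assumptions:

```python
def validate_rewrite(output: str, source_facts: set, max_bullets: int = 3) -> bool:
    """Reject LLM output that violates the prompt's constraints."""
    bullets = [line for line in output.splitlines() if line.strip().startswith("-")]
    if len(bullets) > max_bullets:
        return False  # structural constraint: too many bullet points
    # Truthfulness heuristic: any technology the rewrite claims must
    # already appear in the source facts (here, a toy tech list).
    mentioned = {word.strip(".,") for word in output.split()}
    claimed_techs = mentioned & {"Python", "Docker", "Kubernetes", "Go"}
    return claimed_techs <= source_facts

ok = validate_rewrite(
    "- Built services in Python\n- Shipped with Docker",
    source_facts={"Python", "Docker"},
)
```

Output that fails validation can be discarded or regenerated, so hallucinated claims never reach the user unchecked.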


PDF Generation Strategy

One non-obvious challenge was PDF generation.

Client-side approaches (e.g., browser rendering) resulted in:

  • Layout inconsistencies
  • Broken pagination
  • ATS parsing failures

The final solution uses server-side HTML rendering, ensuring:

  • Consistent typography
  • Predictable layout
  • ATS-friendly output

This decision significantly improved real-world usability.
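The server-side path can be sketched as a deterministic HTML template handed to an HTML-to-PDF engine. WeasyPrint appears here only as one possible library, and the template is illustrative:

```python
from string import Template

# A fixed template keeps typography and layout identical for every render.
CV_TEMPLATE = Template("""<!DOCTYPE html>
<html><head><style>
  body { font-family: Georgia, serif; font-size: 11pt; }
  h1 { font-size: 16pt; }
</style></head>
<body><h1>$name</h1><p>$summary</p></body></html>""")

html = CV_TEMPLATE.substitute(name="Jane Doe", summary="Backend engineer.")

# The rendered HTML is then converted to PDF server-side, e.g.:
# from weasyprint import HTML
# HTML(string=html).write_pdf("cv.pdf")
```

Since the server controls fonts, page size, and pagination, the output no longer depends on whichever browser the user happens to run.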


Performance and Cost Considerations

LLM usage introduces both latency and cost concerns.

YaAICV mitigates these through:

  • Prompt minimization
  • Selective inference (only reprocessing modified sections)
  • Response caching
  • Multi-provider routing

This design allows the system to scale without becoming prohibitively expensive.
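Response caching, for instance, can be keyed on a hash of the prompt so that unchanged sections never trigger a repeat call. A minimal sketch, with illustrative function names:

```python
import hashlib

_cache: dict = {}

def cached_analyze(provider_fn, prompt: str) -> str:
    """Call the provider only on cache miss, keyed by prompt hash."""
    key = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
    if key not in _cache:
        _cache[key] = provider_fn(prompt)
    return _cache[key]

calls = []
def fake_llm(prompt: str) -> str:
    calls.append(prompt)  # record each real "API call"
    return prompt.upper()

first = cached_analyze(fake_llm, "rewrite experience section")
second = cached_analyze(fake_llm, "rewrite experience section")
```

Combined with selective inference, editing one experience entry re-runs exactly one prompt instead of reprocessing the whole CV.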


Design Trade-offs

Several trade-offs shaped the final architecture:

  • Flexibility vs Control: More AI freedom leads to better phrasing but lower reliability
  • Structure vs Usability: Structured CVs are powerful but require more user input
  • Accuracy vs Cost: Higher-quality models improve output but increase operational cost

The system intentionally favors control and transparency over full automation.


Future Directions

The current architecture opens the door to several enhancements:

  • Embedding-based semantic search
  • Vector similarity scoring
  • Fine-tuned domain-specific models
  • Multi-language CV optimization
  • Integration with job platforms


Conclusion

YaAICV demonstrates a broader principle in AI system design:

The value of LLMs is not in replacing logic, but in augmenting it.

By combining structured data, deterministic processing, and controlled LLM usage, it is possible to build systems that are both powerful and reliable.

Rather than treating AI as a black box, YaAICV treats it as a pluggable, observable component—one that enhances human decision-making without replacing it.


Project Repository: https://github.com/pernastefano/yaaicv