Equity Valuation Engine

8 min read
Tags: #project #python #yfinance

A modular, data-driven framework for financial analysis and intrinsic valuation


Overview

Over the past few months, I designed and implemented a modular Equity Valuation Engine in Python. The goal was straightforward but ambitious:

Build a clean, extensible system that extracts financial data from public sources, structures it into a coherent domain model, computes derived financial metrics, and runs multiple valuation methodologies on top of the same dataset.

This project combines financial modeling, software architecture, data normalization, and applied quantitative analysis. It is designed to be readable for finance professionals while structured in a way that reflects sound engineering principles.


High-Level Goals

The system was built around four core objectives:

  1. Standardize financial data into a strongly structured domain model.
  2. Decouple data sources from valuation logic using a repository pattern.
  3. Support multiple valuation methodologies for cross-model comparison.
  4. Enable reproducibility and extensibility through modular design.

Instead of treating valuation as a spreadsheet exercise, the project formalizes it as a composable software system.


System Architecture

The codebase follows a layered, domain-oriented architecture:

    application/
        metrics_loader/
        valuations/
    domain/
        metrics/
        valuation/
    infrastructure/
        repositories/
        mappers/
    calculations/

Layer Responsibilities

  • Infrastructure Layer
    Handles data access (currently via yfinance).
    Responsible for raw extraction and normalization.

  • Domain Layer
    Contains typed financial models (StockMetrics, Financials, BalanceSheet, etc.).
    Houses business logic and ratio calculations.

  • Application Layer
    Orchestrates workflows (loading metrics, running valuation engines).

  • Calculations Module
    Implements financial formulas (WACC, ROIC, DCF math, etc.).

This separation ensures:

  • Clean abstraction boundaries
  • Easier testing
  • Future extensibility (e.g., replacing Yahoo Finance with EDGAR or a paid API)

Core Domain Model: StockMetrics

At the center of the system is a strongly typed dataclass:

StockMetrics

This object aggregates all financial and market data required for valuation.

It contains:

  • CompanyProfile
  • Financials
  • CashFlow
  • BalanceSheet
  • MarketData
  • Valuation
  • HistoricalData
  • Ratios

Each subcomponent is a dedicated dataclass. After initialization, derived metrics are computed automatically (via __post_init__).
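
To make the __post_init__ mechanism concrete, here is a minimal sketch of the idea. The field names, the trimmed-down Financials and Ratios classes, and the "EXMP" ticker are assumptions for illustration; the real StockMetrics carries many more subcomponents.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Financials:
    # Primitive inputs (hypothetical field names).
    revenue: float
    gross_profit: float
    operating_income: float
    net_income: float


@dataclass
class Ratios:
    # Derived values; they stay None until StockMetrics fills them in.
    gross_margin: Optional[float] = None
    operating_margin: Optional[float] = None
    net_margin: Optional[float] = None


@dataclass
class StockMetrics:
    ticker: str
    financials: Financials
    ratios: Ratios = field(default_factory=Ratios)

    def __post_init__(self) -> None:
        # Called by the generated __init__, so derived metrics exist
        # as soon as the object is constructed.
        revenue = self.financials.revenue
        if revenue:  # leave margins as None when revenue is zero
            self.ratios.gross_margin = self.financials.gross_profit / revenue
            self.ratios.operating_margin = self.financials.operating_income / revenue
            self.ratios.net_margin = self.financials.net_income / revenue


metrics = StockMetrics("EXMP", Financials(1000.0, 400.0, 200.0, 100.0))
print(metrics.ratios)
```

Keeping the derivation inside the constructor path means callers can never hold a StockMetrics whose ratios are out of sync with its raw inputs.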

Example of Derived Computations

From primitive financial inputs, the system computes:

  • Revenue growth rate
  • Gross / Operating / Net margins
  • ROE, ROIC, ROA
  • Free Cash Flow (FCF)
  • Cost of debt
  • Effective tax rate
  • WACC inputs

This design ensures:

  • Raw data is extracted once.
  • Business logic is centralized.
  • Valuation engines operate on normalized inputs.
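
As an illustration of how a few of these derived metrics fit together, the sketch below computes cost of debt, effective tax rate, and WACC using the standard textbook formulas. The function names and the zero-value fallbacks are assumptions, and the cost of equity is taken as a given input because the post does not say how it is estimated.

```python
def cost_of_debt(interest_expense: float, total_debt: float) -> float:
    """Pre-tax cost of debt: interest paid divided by total debt."""
    return interest_expense / total_debt if total_debt else 0.0


def effective_tax_rate(income_tax: float, pretax_income: float) -> float:
    """Tax actually paid as a share of pre-tax income (0 if no profit)."""
    if pretax_income <= 0:
        return 0.0
    return income_tax / pretax_income


def wacc(equity_value: float, debt_value: float, cost_of_equity: float,
         pre_tax_cost_of_debt: float, tax_rate: float) -> float:
    """Weighted average cost of capital with the usual debt tax shield."""
    total = equity_value + debt_value
    if total <= 0:
        raise ValueError("capital structure must have positive total value")
    equity_weight = equity_value / total
    debt_weight = debt_value / total
    return (equity_weight * cost_of_equity
            + debt_weight * pre_tax_cost_of_debt * (1 - tax_rate))


rd = cost_of_debt(interest_expense=10.0, total_debt=200.0)   # 5% pre-tax
tax = effective_tax_rate(income_tax=20.0, pretax_income=100.0)  # 20%
rate = wacc(800.0, 200.0, cost_of_equity=0.10,
            pre_tax_cost_of_debt=rd, tax_rate=tax)
print(round(rate, 4))
```

With these inputs the equity leg contributes 0.8 × 10% and the debt leg 0.2 × 5% × (1 − 20%), so the result is roughly 8.8%.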


Data Extraction: How yfinance Feeds the System

The project uses the yfinance library as a data provider. However, it does not tightly couple the system to Yahoo’s raw structure.

Repository Pattern

YfinanceDataLoader implements a FinancialRepository interface.

Its responsibilities:

  • Fetch:

    • Income statement
    • Balance sheet
    • Cash flow statement
    • Earnings history
    • Market data
    • Price history
  • Normalize:

    • Inconsistent labels
    • Quarterly vs annual formats
    • Missing values
    • Currency differences
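
The decoupling this buys can be sketched roughly as follows. Only the FinancialRepository name comes from the project; the method names, the in-memory stand-in, and the price_to_earnings helper are hypothetical and exist only to show that calling code depends on the interface rather than on yfinance.

```python
from typing import Dict, Protocol


class FinancialRepository(Protocol):
    """Illustrative interface; the real one exposes more methods."""

    def income_statement(self, ticker: str) -> Dict[str, float]: ...

    def market_price(self, ticker: str) -> float: ...


class InMemoryRepository:
    """Stand-in data source with no network access, handy for tests."""

    def __init__(self, statements: Dict[str, Dict[str, float]],
                 prices: Dict[str, float]) -> None:
        self._statements = statements
        self._prices = prices

    def income_statement(self, ticker: str) -> Dict[str, float]:
        return dict(self._statements[ticker])

    def market_price(self, ticker: str) -> float:
        return self._prices[ticker]


def price_to_earnings(repo: FinancialRepository, ticker: str,
                      shares_outstanding: float) -> float:
    """Uses only the interface, so any provider can be swapped in."""
    net_income = repo.income_statement(ticker)["net_income"]
    eps = net_income / shares_outstanding
    return repo.market_price(ticker) / eps


repo = InMemoryRepository({"EXMP": {"net_income": 100.0}}, {"EXMP": 20.0})
pe = price_to_earnings(repo, "EXMP", shares_outstanding=50.0)
print(pe)
```

A yfinance-backed loader, an EDGAR parser, or a paid API client would each satisfy the same interface, which is what keeps the valuation code unchanged when the source changes.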

Label Normalization

Yahoo Finance labels vary by ticker and geography.
To mitigate fragility, the system:

  • Normalizes labels (lowercase, underscore formatting)
  • Uses candidate label lists for matching
  • Searches DataFrames programmatically

This makes the extraction logic more robust than direct column indexing.
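
A minimal sketch of candidate-list matching is shown below. It works on a plain dictionary of label to value rather than a DataFrame so that it stays self-contained; the function names and the sample labels are assumptions, not the project's actual mapper code.

```python
import re
from typing import Dict, Iterable, Optional


def normalize_label(label: str) -> str:
    """Lowercase a label and collapse non-alphanumeric runs to underscores."""
    return re.sub(r"[^a-z0-9]+", "_", label.strip().lower()).strip("_")


def find_value(raw: Dict[str, float],
               candidates: Iterable[str]) -> Optional[float]:
    """Return the value of the first candidate label present in raw."""
    normalized = {normalize_label(key): value for key, value in raw.items()}
    for candidate in candidates:
        key = normalize_label(candidate)
        if key in normalized:
            return normalized[key]
    return None  # caller decides how to handle a missing line item


raw_statement = {"Total Revenue": 1000.0, "Net Income": 100.0}
revenue = find_value(raw_statement, ["revenue", "total_revenue"])
ebit = find_value(raw_statement, ["ebit", "operating_income"])
print(revenue, ebit)
```

Candidates are tried in order, so the list doubles as a priority ranking when a statement contains more than one plausible label.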

TTM (Trailing Twelve Months) Assembly

When quarterly data exists:

  • The loader extracts the last four quarters
  • Sums them to compute TTM values
  • Falls back to annual values if needed

This supports realistic DCF inputs based on current performance rather than outdated annual figures.
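
The TTM rule described above can be sketched as a small function. The name, the newest-first ordering of the quarterly values, and the treatment of a missing quarter as a reason to fall back are assumptions. Note that summing only makes sense for flow items such as revenue or free cash flow, not for balance sheet snapshots.

```python
from typing import Optional, Sequence


def trailing_twelve_months(quarterly: Sequence[Optional[float]],
                           annual: Optional[float]) -> Optional[float]:
    """Sum the four most recent quarters, else fall back to the annual value.

    quarterly is expected newest first; None marks a missing quarter.
    """
    latest_four = list(quarterly[:4])
    if len(latest_four) == 4 and all(v is not None for v in latest_four):
        return sum(latest_four)
    return annual


full = trailing_twelve_months([30.0, 25.0, 25.0, 20.0, 18.0], annual=90.0)
gappy = trailing_twelve_months([30.0, None, 25.0, 20.0], annual=90.0)
print(full, gappy)
```

The first call uses the four newest quarters and ignores the fifth; the second has a gap, so it returns the annual figure instead.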

Currency Handling

The repository includes a conversion step to normalize financial data to USD when required. This avoids valuation distortion due to currency mismatch between financial statements and market pricing.


Valuation Models Implemented

The system supports three independent valuation approaches.


1. Discounted Cash Flow (DCF)

Measures: Intrinsic value based on projected free cash flows.

Core Logic

  1. Project FCF forward using growth assumptions
  2. Compute terminal value using perpetual growth
  3. Discount cash flows using WACC
  4. Divide by shares outstanding
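
The four steps above can be sketched as a single function. This is a simplified illustration under stated assumptions: a constant growth rate, a five-year default horizon, and a Gordon growth terminal value. It follows the step list literally and divides the discounted total by the share count; a fuller model would typically subtract net debt first to move from enterprise value to equity value. The function and parameter names are hypothetical.

```python
def dcf_value_per_share(fcf: float, growth_rate: float, wacc: float,
                        terminal_growth: float, shares_outstanding: float,
                        years: int = 5) -> float:
    """Per-share value from projected FCF plus a perpetual-growth terminal value."""
    if wacc <= terminal_growth:
        raise ValueError("WACC must exceed the terminal growth rate")

    # 1. Project FCF forward for each forecast year.
    projected = [fcf * (1 + growth_rate) ** t for t in range(1, years + 1)]

    # 2. Terminal value at the end of the horizon (Gordon growth formula).
    terminal_value = (projected[-1] * (1 + terminal_growth)
                      / (wacc - terminal_growth))

    # 3. Discount both the yearly cash flows and the terminal value at WACC.
    pv_fcf = sum(cf / (1 + wacc) ** t
                 for t, cf in enumerate(projected, start=1))
    pv_terminal = terminal_value / (1 + wacc) ** years

    # 4. Spread the total over the share count.
    return (pv_fcf + pv_terminal) / shares_outstanding


value = dcf_value_per_share(fcf=100.0, growth_rate=0.0, wacc=0.10,
                            terminal_growth=0.0, shares_outstanding=10.0)
print(round(value, 2))
```

With zero growth the cash flows form a plain perpetuity worth 100 / 0.10 = 1000, or about 100 per share, which is a convenient sanity check for the discounting logic.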

Financial Inputs Used

  • FCF (TTM)
  • Revenue growth
  • Operating margins
  • Cost of debt
  • Tax rate
  • Capital structure

Characteristics

  • Most theoretically grounded
  • Highly sensitive to WACC and terminal growth
  • Suitable for stable, cash-generating businesses

2. P/E Multiple Projection Model

Measures: Future earnings value based on EPS growth and exit multiple.

Core Logic

  1. Project EPS growth
  2. Apply target P/E multiple
  3. Discount back to present value
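
A minimal sketch of these three steps follows, assuming a constant EPS growth rate, a single exit multiple at the end of the horizon, and no dividends along the way. The function name, parameter names, and five-year default are illustrative.

```python
def pe_projection_value(eps: float, eps_growth: float, target_pe: float,
                        discount_rate: float, years: int = 5) -> float:
    """Present value of a future price implied by EPS growth and an exit P/E."""
    # 1. Project EPS to the end of the horizon.
    future_eps = eps * (1 + eps_growth) ** years
    # 2. Apply the target multiple to get an implied future price.
    future_price = future_eps * target_pe
    # 3. Discount that price back to today.
    return future_price / (1 + discount_rate) ** years


fair_value = pe_projection_value(eps=5.0, eps_growth=0.10, target_pe=20.0,
                                 discount_rate=0.10)
print(round(fair_value, 2))
```

When growth equals the discount rate the two effects cancel, so the result is roughly eps × target_pe = 100. Because the output scales linearly with target_pe, a 10% change in the multiple moves the estimate by 10%, which is why the multiple assumption dominates.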

Characteristics

  • Market-relative valuation
  • Simpler than DCF
  • Multiple assumption dominates outcome

3. ROE-Based Equity Model

Measures: Shareholder value via equity growth and dividend flows.

Core Logic

  1. Grow equity per share using ROE
  2. Model payout/dividends
  3. Discount shareholder cash flows
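
One way to sketch this logic is shown below. The post does not spell out the terminal treatment, so valuing the ending book value at an exit price-to-book multiple is an assumption made here, as are the function name, the constant ROE and payout ratio, and the ten-year default.

```python
def roe_equity_value(book_value_per_share: float, roe: float,
                     payout_ratio: float, discount_rate: float,
                     years: int = 10,
                     exit_price_to_book: float = 1.0) -> float:
    """Per-share value from discounted dividends plus ending book value."""
    bvps = book_value_per_share
    pv_dividends = 0.0
    for t in range(1, years + 1):
        earnings = bvps * roe                 # 1. equity earns its ROE
        dividend = earnings * payout_ratio    # 2. part is paid out
        pv_dividends += dividend / (1 + discount_rate) ** t  # 3. discount it
        bvps += earnings - dividend           # the rest is reinvested
    # Assumed terminal step: ending equity valued at a price-to-book multiple.
    pv_terminal = bvps * exit_price_to_book / (1 + discount_rate) ** years
    return pv_dividends + pv_terminal


par_value = roe_equity_value(50.0, roe=0.10, payout_ratio=1.0,
                             discount_rate=0.10)
compounder = roe_equity_value(50.0, roe=0.20, payout_ratio=0.0,
                              discount_rate=0.10)
print(round(par_value, 2), round(compounder, 2))
```

When ROE equals the discount rate the result stays at roughly book value (about 50 here) regardless of payout. When ROE exceeds the discount rate, retaining earnings adds value, which is the reinvestment sensitivity noted in the characteristics below.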

Characteristics

  • Focuses on capital efficiency
  • Suitable for high-ROE compounders
  • Sensitive to reinvestment assumptions

Differences Between Models

Model | Anchored On      | Sensitive To          | Best For
DCF   | Free Cash Flow   | WACC, terminal growth | Mature, stable firms
P/E   | Earnings         | Exit multiple         | Comparable-based valuation
ROE   | Return on Equity | ROE persistence       | Capital-efficient firms

Running all three provides triangulation rather than a single-point estimate.


Engineering Decisions & Design Tradeoffs

Strengths

  • Clean separation of data, domain, and valuation logic
  • Typed domain model improves reliability
  • Mapper-based extraction reduces hard-coded logic
  • Extensible repository pattern
  • Clear valuation engine boundaries

Known Risks

  • yfinance is not a guaranteed stable API
  • Label matching is inherently fragile
  • TTM assembly assumes reporting continuity
  • Valuation sensitivity remains high (as expected in finance)

Why This Project Matters

This project demonstrates:

  • Applied financial modeling beyond spreadsheets
  • Practical use of object-oriented design
  • Data normalization and transformation logic
  • Handling real-world API inconsistency
  • Clear separation of concerns in architecture
  • Translating financial theory into maintainable code

It bridges finance and software engineering.

For a technical recruiter, this shows:

  • Domain-driven design thinking
  • Production-minded architecture
  • Practical quantitative implementation
  • Strong abstraction discipline
  • Ability to operationalize financial theory

Future Improvements

Planned enhancements include:

  • Improved validation and logging around missing data
  • Integration with alternative data providers (e.g., EDGAR parsing)
  • Sensitivity analysis engine
  • Monte Carlo valuation scenarios
  • Data quality scoring
  • Automated regression tests against historical snapshots

Closing Thoughts

Valuation is fundamentally about structured reasoning under uncertainty.

This project formalizes that reasoning into a composable system:

  • Extract → Normalize → Model → Value → Compare

Rather than treating finance as a spreadsheet exercise, it treats it as a software problem, with abstractions, interfaces, domain boundaries, and testable components.

It is both a financial modeling engine and an architectural exercise in turning messy real-world data into structured, decision-ready insights.