
Assessment Criteria

Measuring the Non-Negotiable Dimensions of Entrepreneurial Success

Supsindex entrepreneurial indices form a structured assessment system designed to evaluate the critical, non-negotiable dimensions of founder capability—the human factors that most strongly determine whether a startup succeeds or fails.

Beyond a “Personality Test”

Why Founder Assessment Matters

Why do startups fail, and what do we mean by the Founder Effect?

Startups rarely fail due to a lack of passion or ambition. They fail because of specific, often measurable blind spots—in judgment, market understanding, behavioral resilience, and decision-making under pressure. Extensive industry analysis consistently shows that over 60% of startup failures are attributable to people-related factors: co-founder conflict, flawed decisions under uncertainty, misalignment with the market or ecosystem, and an inability to adapt when conditions change. These failures occur not because the idea was impossible, but because the human system executing it broke down.

The cost of this blind spot is enormous. Between 2019 and 2024 alone, hundreds of billions of dollars in global venture capital were lost to failures that were, in hindsight, preventable. Yet despite this, most ecosystems still rely on intuition, interviews, and surface-level signals to evaluate founders—the very layer where risk is highest.

[Graphic Design: Visualizing the "Founder Effect" and 60% Failure Rate Statistics]

What the ecosystem lacks is not ambition or capital, but accurate measurement of the human side of entrepreneurship.

What Supsindex Means by “Founder Soft Power”

In the context of Supsindex, “Soft Power” refers to the non-technical, human capabilities of a founder that materially influence entrepreneurial outcomes:

  • a founder’s cognitive understanding of entrepreneurship and market mechanics,
  • their behavioral judgment and decision-making patterns under uncertainty and pressure, and
  • their awareness of—and fit with—the ecosystem in which they operate.

These capabilities are complex, multidimensional, and dynamic. Existing assessment approaches treat them as fragmented traits—often isolated, difficult to compare, and poorly validated. As a result, decision-making defaults to intuition.

Supsindex exists to replace that intuition with structured, comparable, and scientifically grounded measurement—through a system of indices.

Supsindex Is an Assessment Engine — Not a Quiz

Supsindex is not a standard personality test, nor a lifestyle or “cosmopolitan” quiz designed for virality. It is a founder assessment engine built to measure the critical dimensions of entrepreneurial performance. At its current stage, Supsindex objectively measures three core dimensions that determine whether a founder can operate effectively under real entrepreneurial conditions.

In parallel, Supsindex is deliberately expanding toward a comprehensive framework for assessing entrepreneurial potential. As additional indices are deployed, further dimensions—such as team dynamics, decision-making under simulated pressure, and longitudinal founder growth—will be progressively unlocked, completing a full and evolving map of founder capability.

[Video: Animation of the Triangulated Assessment Engine expanding]

The Triangulated Assessment Engine (Current State)

As of today, the Supsindex Assessment Engine is triangulated, targeting three foundational pillars of founder soft power at the individual level:

Cognitive Power

Can you understand and structure the problem correctly?

Measures the accuracy, coherence, and practical usability of a founder’s entrepreneurial knowledge and mental models.

Behavioral Judgment

Will you make sound decisions under pressure and uncertainty?

Assesses behavioral patterns, resilience, ethical grounding, bias exposure, and decision quality in stressful or ambiguous conditions.

Ecosystem Fit

Do you understand the rules, norms, and constraints of your target ecosystem?

Evaluates a founder’s awareness of market structures, regulations, cultural dynamics, and ecosystem-specific realities.

These three dimensions are fully measurable today through Supsindex’s live indices!

Supsindex Methodology & Scoring

Supsindex moves beyond simplistic scoring models where every question is treated equally. Instead, the platform employs advanced psychometric and decision-science methods, including Item Response Theory, Signal Detection Theory, comparative judgment modeling, and anchor-item equating, each detailed under its index below.

These methods are widely used in high-stakes domains such as medical licensing, aviation, and military selection—fields where false positives carry unacceptable risk. Supsindex applies the same rigor to entrepreneurship, where the cost of misjudgment is measured in lost capital, wasted time, and unrealized human potential.

Supsindex does not ask only who you think you are. It measures how you are likely to perform when reality applies pressure.

Assessment Engine Pillar 1: FPA

Founder Public Awareness – The Cognitive Engine

Entrepreneurial Literacy

Signal Detection Ability

Cognitive Processing Power

The FPA operates as a “Dynamic Knowledge Engine”. Its question bank adapts based on a founder’s startup stage (e.g., Pre-Seed vs. Series A) and industry context (e.g., EdTech vs. FinTech).
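The stage- and industry-aware deck selection described above can be illustrated with a minimal Python sketch. The field names, stages, and sample items here are hypothetical, not Supsindex's production schema:

```python
from dataclasses import dataclass

@dataclass
class Question:
    text: str
    stages: set       # startup stages this item applies to, e.g. {"pre-seed", "seed"}
    industries: set   # industry contexts, e.g. {"edtech", "fintech"}

def select_deck(bank, stage, industry):
    """Return the subset of the question bank matching a founder's context."""
    return [q for q in bank if stage in q.stages and industry in q.industries]

# Hypothetical two-item bank: only the first item fits a pre-seed EdTech founder.
bank = [
    Question("Interpret churn in a freemium funnel", {"pre-seed", "seed"}, {"edtech", "fintech"}),
    Question("Model Series A dilution scenarios", {"series-a"}, {"fintech"}),
]
deck = select_deck(bank, "pre-seed", "edtech")
```

The real engine would draw from a much larger calibrated bank, but the filtering principle is the same: a founder never sees items outside their declared stage and industry.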


FPA Measurement Methodology

Using probabilistic scoring models rather than deterministic ones

Empirical Weighting via the 2PL IRT Model

Scores reflect true underlying ability

Traditional tests weight every question equally, a scientifically flawed approach that treats trivial definitions (e.g., “What is B2B?”) the same as complex strategic diagnoses that require synthesis across multiple variables.

 

Supsindex applies a 2-Parameter Logistic (2PL) Item Response Theory (IRT) model, ensuring that scores reflect true underlying ability rather than surface-level correctness.

Difficulty Parameter
Derived empirically from thousands of founder responses, this parameter validates question difficulty based on real performance data—not expert intuition. Founders are rewarded for solving problems that challenge the majority of peers, demonstrating depth rather than surface knowledge.

 

Discrimination Parameter
Measures how effectively a question separates high-performers from low-performers. Correct answers on highly discriminative items—those strongly correlated with venture success—contribute significantly more to the final score than generic items. This allows the algorithm to learn which questions actually matter.
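The 2PL model combines both parameters in a single response function: the probability of answering correctly rises with ability, steepened by discrimination and shifted by difficulty. The sketch below is a simplified illustration with hypothetical item parameters and a grid-search ability estimate, not the production scoring engine:

```python
import math

def p_correct(theta, a, b):
    """2PL item response function: probability of a correct answer
    given ability theta, discrimination a, and difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def estimate_theta(responses, thetas=None):
    """Grid-search maximum-likelihood estimate of ability from
    (discrimination, difficulty, answered_correctly) triples."""
    thetas = thetas or [t / 10 for t in range(-40, 41)]
    def loglik(theta):
        ll = 0.0
        for a, b, correct in responses:
            p = p_correct(theta, a, b)
            ll += math.log(p if correct else 1.0 - p)
        return ll
    return max(thetas, key=loglik)

# Hypothetical responses: a correct answer on a hard, highly discriminative
# item (a=1.8, b=1.0) moves the estimate more than an easy, generic one.
responses = [(1.8, 1.0, True), (0.4, -2.0, True), (1.5, 0.5, False)]
theta_hat = estimate_theta(responses)
```

Production IRT engines use more efficient estimators (e.g., EAP or Newton-Raphson) and empirically calibrated parameters, but the weighting logic is as shown: item parameters, not raw counts, determine how much each answer moves the score.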

Signal Detection Theory

(The “Distractor” Mechanism)

A founder’s ability to filter signal from noise

To measure this executive function, the FPA intentionally includes Distractor questions—content that appears technical or important but is statistically irrelevant to startup success (e.g., trivia about ergonomic chair angles, font ligatures, or obsolete coding syntax).

 

Using Signal Detection Theory, we calculate D-Prime (d’) sensitivity:

 

  • Hit Rate: Correctly identifying irrelevant noise as noise
  • False Alarm Rate: Incorrectly dismissing a real strategic issue as noise

 

A high D-Prime score requires both vigilance and discernment. The model penalizes false alarms (ignoring real problems) as heavily as misses (falling for distractions), distinguishing focused founders from those who are inattentive, overly cynical, or guessing.
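The d' computation follows directly from the hit and false-alarm rates via the inverse normal CDF. A minimal sketch (the optional extreme-rate adjustment shown is a standard log-linear correction, an assumption here rather than Supsindex's documented procedure):

```python
from statistics import NormalDist

def d_prime(hit_rate, fa_rate, n=None):
    """Sensitivity index d' = z(hit rate) - z(false alarm rate)."""
    if n:  # log-linear correction: shrink 0/1 rates given n trials per class
        hit_rate = (hit_rate * n + 0.5) / (n + 1)
        fa_rate = (fa_rate * n + 0.5) / (n + 1)
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)

# A founder who flags 90% of distractors while dismissing only 10% of real
# issues earns a high d'; chance-level performance yields d' of exactly 0.
focused = d_prime(0.90, 0.10)
guessing = d_prime(0.50, 0.50)
```

Note how the formula penalizes both failure modes symmetrically: an overly cynical founder who dismisses everything inflates the false-alarm rate and loses exactly what the vigilance term gained.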

 

Assessment Engine Pillar 2: GEB

General Entrepreneur Behavior – The Behavioral Engine

Decision-Making Quality

Resilience Under Pressure

Cognitive Bias Susceptibility

The GEB operates as a “Behavioral Judgment Engine”. Instead of asking founders to describe themselves, it places them inside realistic entrepreneurial situations and evaluates how they choose to act under pressure, uncertainty, and constraint.


Scenarios are distributed across 15 core behavioral categories.

GEB Measurement Methodology

Using comparative judgment and probabilistic modeling rather than self-reporting or fixed-trait scoring

Traditional personality tests suffer from Ipsative Bias. If a founder rates themselves highly on every positive trait, the assessment loses discriminatory power. Forced-choice formats also introduce artificial negative correlations—for example, appearing low in Adaptability simply because Integrity was prioritized. GEB is designed explicitly to overcome these limitations.

Thurstonian Item Response Theory

Comparative Judgment Modeling

In every entrepreneurial crisis scenario, founders are asked to select:

 

  • the Most Effective action
  • the Least Effective action

 

This forces real trade-offs under pressure, mirroring actual founder decision-making.

 

The Problem (Ipsative Data): In conventional scoring, choosing Option A implies not choosing Option B, creating mathematical dependency. This can falsely suppress traits (e.g., appearing “low” in Resilience simply because Ethics was prioritized).

 

Supsindex applies Thurstonian Item Response Theory to recover each founder’s latent utility values for every behavioral trait. Decisions are modeled as comparisons between underlying utilities rather than fixed choices.

 

This allows the system to mathematically demonstrate that a founder can be both highly Adaptable and highly Principled, instead of forcing artificial trade-offs.
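Under Thurstone's model, a preference between two options reduces to comparing noisy latent utilities. The sketch below shows the simplest (Case V) form with unit-variance Gaussian noise and hypothetical trait utilities; the full Thurstonian IRT estimation that recovers utilities from most/least choices is considerably more involved:

```python
from statistics import NormalDist

def p_prefer(u_a, u_b):
    """Thurstone Case V: probability that option A is judged more effective
    than option B, given latent utilities and unit-variance Gaussian noise
    on each judgment (difference has variance 2, hence the sqrt(2))."""
    return NormalDist().cdf((u_a - u_b) / 2 ** 0.5)

# Most/least choices decompose into pairwise comparisons of utilities, so
# two traits can BOTH carry high utility without forced suppression.
utilities = {"adaptability": 1.4, "integrity": 1.5, "panic_response": -0.8}
p_integrity_over_adapt = p_prefer(utilities["integrity"], utilities["adaptability"])
p_adapt_over_panic = p_prefer(utilities["adaptability"], utilities["panic_response"])
```

The key property is visible in the numbers: integrity barely edges out adaptability (a near-coin-flip preference), yet both decisively beat the panic response. Choosing one strong trait over another no longer implies the loser is weak.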

Dual-Engine Behavioral Analysis

GEB produces a dual-layer behavioral profile, capturing both strengths and risks.

The Strength Engine: The GEB scoring engine measures 19 positive behavioral constructs that drive entrepreneurial success, such as opportunity recognition, recombination, grit, and strategic persistence.

 

The Risk Engine (Bias Detection): Measures susceptibility to 20 cognitive derailers that frequently undermine early-stage companies.

 

Distractor Engineering: Incorrect options are intentionally engineered to be socially desirable. For example, a Micromanagement bias may be framed as “Supportive Checking-In.”

 

  • Novice founders select it because it sounds virtuous
  • Experienced founders reject it because it erodes autonomy

 

High Score = Low Risk: Consistently avoiding these socially desirable traps results in a higher safety score, signaling a preference for effectiveness over appearances.
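The trap-avoidance logic above can be illustrated with a short sketch. Scenario IDs and option names are hypothetical, and this version counts every trap equally, a simplification of whatever weighting the live engine applies:

```python
def safety_score(choices, trap_options):
    """Fraction of scenarios in which the founder avoided the socially
    desirable trap option; higher means lower bias risk."""
    avoided = sum(1 for scenario, picked in choices.items()
                  if picked != trap_options.get(scenario))
    return avoided / len(choices)

# Hypothetical scenarios: "supportive_check_in" masks a micromanagement bias.
traps = {"s1": "supportive_check_in", "s2": "gut_feel_pivot"}
choices = {"s1": "delegate_with_metrics", "s2": "gut_feel_pivot"}
score = safety_score(choices, traps)  # one trap avoided, one taken
```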

Assessment Engine Pillar 3: EEA

Ecosystem Environmental Awareness – The Context Engine

Local Market Fit

Regulatory Awareness

Contextual “Street Smarts”

The EEA operates as a “Contextual Intelligence Engine”. It measures whether a founder understands the specific environment in which they are operating—its rules, constraints, norms, and hidden dynamics—rather than relying on generic startup playbooks. A founder who succeeds in one ecosystem may fail in another if they attempt to apply the same assumptions unchanged. EEA is designed to measure this hyper-local readiness.


EEA Measurement Methodology

Ensuring cross-ecosystem fairness through equating, contextual benchmarking, and content freshness

Because the EEA generates different question sets for different regions and industries (e.g., US SaaS vs. UK BioTech), raw scores are not directly comparable. This comparability challenge, which arises from the matrix-sampling design, is resolved through equating.

Anchor Item Equating

Cross-Context Fairness

Content Freshness Control

The Problem: Is a score of 800 in a relatively permissive regulatory environment (e.g., Delaware) equivalent to a score of 800 in a highly constrained one (e.g., Germany)? Raw scores would suggest yes, but the underlying competence required is fundamentally different!

 

The Solution: Anchor Item Equating. At least 20% of the questions in every EEA test deck are universal anchor items—identical across regions and industries. Performance on these anchors is used to statistically adjust the difficulty of the context-specific questions.

 

The Result: If anchor performance indicates strong general competence but regional scores appear lower, the system recognizes higher contextual difficulty and adjusts accordingly. This ensures that a score of 800 in a “hard” ecosystem reflects the same level of mastery as a score of 800 in an “easy” ecosystem—making EEA a universally valid measure of local readiness.
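A common way to implement this adjustment is mean-sigma linear equating on the shared anchors. The sketch below assumes that method and uses made-up numbers; Supsindex's actual equating procedure may differ in form:

```python
from statistics import mean, stdev

def equate(regional_scores, anchor_regional, anchor_reference):
    """Mean-sigma linear equating: rescale a region's scores so its
    anchor-item statistics match the reference population's."""
    a = stdev(anchor_reference) / stdev(anchor_regional)
    b = mean(anchor_reference) - a * mean(anchor_regional)
    return [a * x + b for x in regional_scores]

# If a region underperforms on identical anchor items, its full-deck scores
# are adjusted upward: the deck was harder, not the founders weaker.
anchor_ref = [600, 700, 800]   # reference performance on shared anchors
anchor_reg = [500, 600, 700]   # same anchors in a harder regional context
adjusted = equate([550, 650], anchor_reg, anchor_ref)
```

Here the regional anchors average 100 points below the reference, so every regional score is shifted up by 100: an 800 earned in the hard ecosystem lands on the same scale as an 800 earned in the easy one.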

Temporal Validity

Content Freshness Control

Ecosystems evolve quickly—especially regulations, compliance frameworks, and institutional norms.

To prevent scoring founders against obsolete information, Supsindex enforces a Time-to-Live (TTL) policy on all EEA questions.

 

  • Each question has a defined review cycle
  • Once expired, it is automatically quarantined
  • Human domain experts must revalidate it before reuse

 

This ensures founders are never penalized or rewarded based on outdated standards.
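The TTL policy reduces to a simple status check at question-serving time. A minimal sketch with hypothetical field names and review window:

```python
from datetime import date, timedelta

def item_status(last_reviewed, ttl_days, today=None):
    """TTL policy sketch: items past their review cycle are quarantined
    until a human domain expert revalidates them."""
    today = today or date.today()
    expired = today - last_reviewed > timedelta(days=ttl_days)
    return "quarantined" if expired else "live"

# A compliance question last reviewed over a year ago under a 365-day TTL
# is pulled from circulation automatically.
status = item_status(date(2024, 1, 1), ttl_days=365, today=date(2025, 2, 5))
```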


EEA Benchmarking System

Relational Scoring & Normative Data

Each founder is compared only within their Exact Contextual Bucket

 

  • Same industry
  • Same startup stage
  • Same ecosystem

 

The normative dataset includes 1,000+ verified founder profiles, stratified for relevance. Benchmarks include:

 

Contextual Average: Mean score of direct peers
Top Decile (Top 10%): A realistic excellence benchmark, excluding statistical outliers
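Bucket benchmarking can be illustrated with a short sketch. The peer scores are hypothetical, and outlier exclusion is omitted for brevity:

```python
from statistics import mean

def bucket_benchmarks(peer_scores):
    """Benchmarks within one contextual bucket
    (same industry, same stage, same ecosystem)."""
    ranked = sorted(peer_scores)
    top_decile_cutoff = ranked[int(0.9 * len(ranked))]
    return {"contextual_average": mean(ranked),
            "top_decile_cutoff": top_decile_cutoff}

def percentile_in_bucket(score, peer_scores):
    """Share of direct peers scoring at or below this founder."""
    return sum(s <= score for s in peer_scores) / len(peer_scores)

# Ten hypothetical peers from the same bucket.
peers = [520, 580, 610, 640, 660, 690, 720, 750, 790, 830]
stats = bucket_benchmarks(peers)
pct = percentile_in_bucket(720, peers)
```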

The “Out of 1000” Scale & Confidence Intervals

Each EEA assessment yields a score out of 1000. But why 1000?

 

Small margins in contextual understanding compound over time. Differences such as 750 vs. 785 meaningfully correlate with downstream outcomes like enterprise sales readiness, regulatory delays, and survival rates.

 

Confidence Intervals (Measurement Honesty): Every score is reported with a Confidence Interval (CI) (e.g., 780 ± 15), representing the Standard Error of Measurement (SEM). This acknowledges uncertainty and provides investors and institutions with a scientifically honest range for true capability.
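Converting a SEM into a reported band uses the normal quantile function; ±1 SEM corresponds to roughly a 68% interval. A minimal sketch of that arithmetic:

```python
from statistics import NormalDist

def score_interval(score, sem, confidence=0.68):
    """Report a score with a confidence band derived from the Standard
    Error of Measurement. The default ~68% band is about +/- 1 SEM."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * sem
    return (score - half_width, score + half_width)

# A reported "780 +/- 15" at ~68% confidence; widening to 95% stretches
# the same SEM to roughly +/- 29 points.
low, high = score_interval(780, sem=15)
low95, high95 = score_interval(780, sem=15, confidence=0.95)
```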

Real-World Sample Scenarios

FPA

Early-stage market validation

Example Challenge: A founder reviews traction data and must distinguish vanity metrics (press mentions, total signups) from actionable indicators (retention, DAU).

Measurement Logic: Signal Detection + IRT-weighted knowledge items evaluate whether the founder correctly filters irrelevant noise without dismissing meaningful signals.

Capability Revealed: Entrepreneurial literacy, cognitive discipline, and depth of understanding under information overload.

GEB

Competitive shock under pressure

Example Challenge: A competitor launches a similar product at a significantly lower price point. The founder must choose the most and least effective response.

Measurement Logic: Thurstonian modeling recovers latent utilities behind choices, detecting both adaptive strategies and cognitive derailers (e.g., panic, overconfidence).

Capability Revealed: Decision-making quality, resilience, bias resistance, and strategic judgment in stressful conditions.

EEA

Market entry & compliance

Example Challenge: A SaaS founder entering the U.S. market must identify which compliance framework (HIPAA, SOC 2, PCI DSS) is required to sell to enterprise clients.

Measurement Logic: Anchor-item equating adjusts difficulty across ecosystems, ensuring scores reflect true contextual mastery rather than regulatory leniency.

Capability Revealed: Ecosystem literacy, regulatory awareness, and readiness to operate within real institutional constraints.


Supsindex Quartile Ranking Framework

Each Supsindex assessment produces a final score that is benchmarked against founders in the same contextual bucket (industry, stage, and ecosystem). Based on this comparison, founders are placed into one of four quartiles, translating raw performance into actionable interpretation.

Q1

Top 25%

Investable Grade

Strong alignment with the behavioral, cognitive, and contextual profiles of founders who successfully raise capital and scale ventures. Suitable for immediate investor engagement.

Q2

50–75%

High Potential

Solid foundational capability with identifiable, isolated gaps (e.g., strong product insight paired with micromanagement risk). Best addressed through targeted coaching or mentorship.

Q3

25–50%

Developing

Noticeable gaps in market understanding, ecosystem awareness, or decision frameworks. Requires structured entrepreneurial education and capability development.

Q4

Bottom 25%

Foundational

Elevated risk indicators detected (e.g., low resilience, high cognitive bias exposure). Suggests the need for role reconsideration, skill rebuilding, or complementary co-founder alignment.

Quartile placement is not a label of worth—it is a diagnostic signal, designed to guide better decisions, not gatekeep opportunity.
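The quartile bands above reduce to a simple percentile lookup. This is a sketch of the published bands, not production code:

```python
def quartile_label(percentile):
    """Map a founder's within-bucket percentile (0-100) to the
    Supsindex quartile and its interpretation label."""
    if percentile >= 75:
        return "Q1", "Investable Grade"
    if percentile >= 50:
        return "Q2", "High Potential"
    if percentile >= 25:
        return "Q3", "Developing"
    return "Q4", "Foundational"

# A founder outperforming 82% of their contextual bucket lands in Q1.
placement = quartile_label(82)
```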

Fairness & Assessment Ethics

Talent is universally distributed, but opportunity is not

Our assessment engine is intentionally designed to reduce systemic bias and ensure ethical, statistically sound evaluation through the following mechanisms:

Adverse Impact Analysis

We continuously run Differential Item Functioning (DIF) analyses using Mantel–Haenszel statistics. If an item advantages or disadvantages a demographic group (holding ability constant), it is flagged, quarantined, and removed—ensuring fairness across gender, ethnicity, and background.
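The Mantel–Haenszel procedure pools 2×2 tables (group × correct/incorrect) across ability strata into a common odds ratio; values near 1.0 indicate no DIF. A minimal sketch with made-up counts (thresholds and flagging rules are Supsindex-internal and not shown):

```python
def mh_odds_ratio(strata):
    """Mantel-Haenszel common odds ratio across ability strata.
    Each stratum: (ref_correct, ref_wrong, focal_correct, focal_wrong).
    A value near 1.0 means the item functions the same for both groups
    once ability is held constant."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den

# Two ability strata where both groups perform identically: no DIF.
no_dif = mh_odds_ratio([(30, 10, 30, 10), (10, 30, 10, 30)])
# The focal group underperforms at matched ability: flag for review.
flagged = mh_odds_ratio([(30, 10, 20, 20), (10, 30, 5, 35)])
```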

Model Fit Transparency

Supsindex does not rely on blind trust in algorithms. We actively monitor Model Fit Indices (including RMSEA and M₂ statistics) to confirm that scoring models accurately represent the underlying data structure.

Marginal Reliability

For complex forced-choice instruments such as the GEB, traditional reliability metrics (e.g., Cronbach’s Alpha) are mathematically invalid. Supsindex applies Marginal Reliability coefficients to ensure precision without statistical distortion.

Cultural Neutrality

The Thurstonian structure of the GEB reduces cultural response styles (such as uniform high self-ratings common in some cultures). By forcing comparative judgments, the system measures capability rather than cultural expression, enabling global applicability.

Local Item Dependence Control

While classical IRT assumes question independence, real-world scenarios often involve clustered case-based items. Supsindex applies Testlet Response Theory (TRT) to correct for Local Item Dependence (LID), preventing artificial score inflation.

Data Privacy by Design

Founder data belongs to the founder. Supsindex follows Privacy-by-Design principles:

  • All data is encrypted
  • Benchmarking uses anonymized aggregates
  • No behavioral data is sold or shared without explicit consent