Foundation in Formation | The Open Company Data Standard

STANDARD
v1.1 SPECIFICATION

The definitive open schema for normalized corporate registry data. Designed for strict interoperability, legal safety, and citation at scale.

Why This Standard Exists

Company data is currently fragmented across thousands of local formats and proprietary silos. Central.Enterprises provides a unified global layer that serves as public infrastructure for economic development, academic research, and system interoperability. By normalizing data into a single, predictable format, we enable anyone to build reliable tools on top of the world's business registry.

Two-Tier Data Model

To balance public transparency with commercial viability and safety, we operate two strict data tiers that share identical structure (same rows, same columns).

1. OpenData Tier

Published for education, research, and public interest. Sensitive or proprietary fields (like direct contacts) are masked to ensure safety while maintaining structural integrity.

2. Premium (Pro) Tier

Designed for operational intelligence. Includes full enrichment values for all columns, such as verified direct emails, active social signals, and spending indicators.

Global Identity Strategy

Identifying a unique company globally is a complex challenge. We employ a tripartite strategy:

  • Stable ID (sdc_id): An internal identifier (a UUID in schema v1.1) for long-term tracking, so that references remain stable across merges or updates.
  • Digital Identity (Domain): We treat a company's primary website domain as its most practical global key, enabling reproducible research on digital presence.
  • Official Registry (Canonical): We normalize local government identifiers (CIF, EIN, CRN) into a standardized COUNTRY-NUMBER format to support cross-border economic analysis.
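As an illustration of the Official Registry rule, the following is a minimal sketch of how a local identifier might be turned into the COUNTRY-NUMBER form. The function name `canonical_reg_number` is hypothetical, and two details are assumptions rather than part of the Standard: that "delimiters" means every non-alphanumeric character, and that letters are upper-cased.

```python
import re


def canonical_reg_number(country_code: str, reg_number: str) -> str:
    """Build a COUNTRY-NUMBER identifier from a local registry number."""
    cc = country_code.strip().upper()
    # Shape check only: two ASCII letters, as in ISO 3166-1 alpha-2.
    if not re.fullmatch(r"[A-Z]{2}", cc):
        raise ValueError(f"not a two-letter country code: {country_code!r}")
    # Assumption: every character that is not a letter or digit is a delimiter.
    number = re.sub(r"[^0-9A-Za-z]", "", reg_number).upper()
    if not number:
        raise ValueError("registry number has no alphanumeric characters")
    return f"{cc}-{number}"


print(canonical_reg_number("US", "12-3456789"))  # US-123456789
```

The printed value matches the reg_number_canonical shown in the Premium row example of this page.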

Core Schema v1.1

The Standard enforces a fixed 37-column layout. The column order is immutable to ensure CSV parser compatibility.

IDX    FIELD NAME                          OPENDATA STATUS  DESCRIPTION
1      sdc_id                              OPEN             Stable internal UUID.
2      name                                OPEN             Legal registered name.
3      website                             OPEN             Primary domain (normalized).
4      country_code                        OPEN             ISO 3166-1 alpha-2.
5      province_region                     OPEN             Administrative region.
6      city                                OPEN             City or municipality.
7      reg_number                          MASKED           Local identifier (e.g. B12345678).
8      reg_number_type                     MASKED           Label of ID (e.g. "NIF").
9      reg_number_country                  MASKED           Country of registration.
10     reg_number_canonical                MASKED           Format: CC-NUMBER.
11     main_category                       OPEN             Primary taxonomy classification.
12     categories                          OPEN             Full taxonomy list (pipe-separated).
13     RSS                                 OPEN             News feed URL.
14     emails                              MASKED           Verified contact emails.
15     phone                               MASKED           Main switchboard number.
16     address                             MASKED           Full postal address.
17     CentralRating                       MASKED           Internal trust score.
18     description                         MASKED           Company description text.
19-20  workday_timing / closed_on          MASKED           Operational hours data.
21     featured_image                      MASKED           Primary brand image URL.
22-23  is_spending_on_ads / competitors    MASKED           Commercial signals.
24-37  Social Fields (LinkedIn...Medium)   MASKED           14 platform-specific columns.
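The relationship between the two tiers can be sketched from the OPENDATA STATUS column alone. The snippet below is an illustrative assumption, not the Foundation's build code: `mask_row` and `is_valid_open_row` are hypothetical helpers, and masking here is decided purely by column position, whereas the actual build also takes source provenance into account.

```python
COLUMN_COUNT = 37
# 1-based indices marked OPEN in the schema table; all other columns are MASKED.
OPEN_COLUMNS = frozenset({1, 2, 3, 4, 5, 6, 11, 12, 13})


def mask_row(premium_row: list[str]) -> list[str]:
    """Return a copy of a 37-column row with every MASKED column emptied."""
    if len(premium_row) != COLUMN_COUNT:
        raise ValueError(f"expected {COLUMN_COUNT} columns, got {len(premium_row)}")
    return [
        value if index in OPEN_COLUMNS else ""
        for index, value in enumerate(premium_row, start=1)
    ]


def is_valid_open_row(row: list[str]) -> bool:
    """Check the column count and that every MASKED column is an empty string."""
    return len(row) == COLUMN_COUNT and all(
        value == ""
        for index, value in enumerate(row, start=1)
        if index not in OPEN_COLUMNS
    )
```

Because masked values become empty strings rather than being dropped, both tiers keep the same 37 positions and a positional parser written for one tier reads the other unchanged.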

Technical Specifications

  • Format: CSV (Comma Separated)
  • Encoding: UTF-8
  • Quoting: RFC 4180 Compliant
  • Line Break: CRLF or LF
  • True/False: output as true / false
  • Empty Values: Empty string ""
  • Dates: ISO 8601 (YYYY-MM-DD)
  • Multivalue: Pipe separator |
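A minimal sketch of how a consumer might apply these conventions with Python's standard csv module. The three-column sample is a hypothetical cut-down file used only for illustration (real files carry all 37 columns), and `parse_bool` and `parse_multivalue` are helper names introduced here, not part of the Standard.

```python
import csv
import io


def parse_bool(value: str):
    """Map 'true'/'false' to bool; an empty string means no value (None)."""
    if value == "":
        return None
    if value == "true":
        return True
    if value == "false":
        return False
    raise ValueError(f"unexpected boolean literal: {value!r}")


def parse_multivalue(value: str) -> list[str]:
    """Split a pipe-separated field; an empty field yields an empty list."""
    return value.split("|") if value else []


# Hypothetical reduced sample with CRLF line breaks and RFC 4180 quoting.
sample = (
    'name,categories,is_spending_on_ads\r\n'
    '"Acme Corp","Tech|Software","true"\r\n'
    '"Beta Ltd","",""\r\n'
)

# newline="" leaves line endings untouched so the csv module handles them itself.
for record in csv.DictReader(io.StringIO(sample, newline="")):
    print(
        record["name"],
        parse_multivalue(record["categories"]),
        parse_bool(record["is_spending_on_ads"]),
    )
```

When reading from disk, the equivalent call would be `open(path, encoding="utf-8", newline="")`, which is the form the csv module documentation recommends.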

Normalization Rules: Websites are canonicalized (lowercase host). Country codes are strict ISO 3166-1 alpha-2. Registry numbers are stripped of delimiters for the canonical field. Builds are deterministic; the same input always yields the same SHA-256 hash.
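The website rule could look roughly like the sketch below. The Standard only states that the host is lower-cased; dropping the scheme, port, and path is an assumption made here because the example rows show a bare domain, and `canonical_host` is a hypothetical name.

```python
from urllib.parse import urlsplit


def canonical_host(website: str) -> str:
    """Return the lower-cased host of a URL or bare domain."""
    raw = website.strip()
    # urlsplit only recognises a host when the input contains "//",
    # so bare domains such as "Acme.COM" get a scheme-relative prefix.
    if "://" not in raw:
        raw = "//" + raw
    # SplitResult.hostname is already lower-cased and excludes any port.
    return urlsplit(raw).hostname or ""


print(canonical_host("https://ACME.com/about"))  # acme.com
print(canonical_host("Acme.COM"))                # acme.com
```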

Dataset Manifest & Provenance

Every distribution is accompanied by a dataset_manifest.json providing cryptographic proof of integrity and build metadata.

{
  "schema_version": "sdc-1.1",
  "tier": "opendata",
  "build_id": "2026-01-09-PUBLIC",
  "generated_at": "2026-01-09T12:00:00Z",
  "files": [
    {
      "filename": "companies_es.csv",
      "rows": 52000,
      "sha256": "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"
    }
  ],
  "maintainer": "Central.Enterprises Foundation",
  "license": "CC BY 4.0",
  "notes": "Masked fields are exported as empty strings."
}
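
A consumer could check a download against the manifest along these lines. This is a sketch under one assumption that the Standard does not state: that the data files sit in the same directory as dataset_manifest.json. The function names are hypothetical.

```python
import hashlib
import json
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large CSVs need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_manifest(manifest_path: Path) -> list[str]:
    """Return the filenames whose SHA-256 differs from the manifest entry."""
    manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
    # Assumption: data files are stored next to the manifest.
    base_dir = manifest_path.parent
    mismatched = []
    for entry in manifest["files"]:
        actual = sha256_of_file(base_dir / entry["filename"])
        if actual != entry["sha256"].lower():
            mismatched.append(entry["filename"])
    return mismatched
```

An empty list from `verify_manifest(Path("dataset_manifest.json"))` means every listed file matched its recorded hash.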

Provenance: OpenData values are populated only when the source's terms allow open redistribution. If the source is restrictive, the value is masked in the Open tier but preserved in the Premium tier, so the Open tier contains only data that may be redistributed.

Data Examples

CSV Header

sdc_id,name,website,country_code,province_region,city,reg_number,reg_number_type...[full list]...Medium

OpenData Row Example

"550e8400-e29b-41d4-a716-446655440000","Acme Corp","acme.com","US","California","San Francisco","","","","","Technology","Tech|Software","https://acme.com/rss","","","","","","","","","","","","","","","","","","","","","","","",""

Premium Row Example

"550e8400-e29b-41d4-a716-446655440000","Acme Corp","acme.com","US","California","San Francisco","12-3456789","EIN","US","US-123456789","Technology","Tech|Software","https://acme.com/rss","contact@acme.com","+1-555-0199","1 Market St","A+","Global leader in widgets...","09:00-17:00","Sun","https://acme.com/logo.png","true","WidgetCo|GadgetInc","acme-corp","@acme","acme","acmevideo",...
VERSIONING

Current: v1.1 (Stable)

We adhere to semantic versioning. A minor bump (1.0 -> 1.1) adds fields in a backward-compatible way; a major bump (2.0) would signal breaking changes.
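
A consumer might apply this versioning rule to the schema_version string found in the manifest (for example "sdc-1.1"). The sketch below assumes that sharing a major version is sufficient for compatibility; the helper names are hypothetical.

```python
def parse_schema_version(value: str) -> tuple[int, int]:
    """Split a string such as 'sdc-1.1' into (major, minor)."""
    prefix = "sdc-"
    if not value.startswith(prefix):
        raise ValueError(f"unexpected schema_version: {value!r}")
    major, minor = value[len(prefix):].split(".")
    return int(major), int(minor)


def is_compatible(expected: str, found: str) -> bool:
    """Assumption: two versions are compatible when their major numbers match."""
    return parse_schema_version(expected)[0] == parse_schema_version(found)[0]


print(is_compatible("sdc-1.0", "sdc-1.1"))  # True
print(is_compatible("sdc-1.1", "sdc-2.0"))  # False
```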