Filtering Company Data

Core concepts of Vainu's Query Language (VQL)

Filter and retrieve company data from Vainu's Nordic business databases using the Organizations and Domains API endpoints.

Overview using a simple example

This example demonstrates the simplest way to query the API. In the payload we define "what companies we want to find" and "what data-points are returned".

import requests
payload = {
    # What companies are you searching for?
    "query": {"?GTE": {"financial_data.revenue": 1000000}},
    # In what country:
    "database": "FI",
    # What datapoints should be returned:
    "fields": [
        "business_id",
        "name",
        "website",
        "company_name",
        "business_units.visiting_address"
    ],
    # How many results:
    "limit": 1
}
response = requests.post(
    "https://api.vainu.io/api/v3/organizations/",
    headers={"Authorization": "Bearer YOUR_ACCESS_TOKEN"},
    json=payload
)
print(response.json())

query

What companies to get. In this example we get the companies in Finland with latest reported revenue greater than or equal to 1000000 EUR (Finnish local currency).

"query": {
  "?OPERATOR": {
    "field_name": value
  }
}
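The shape above can be produced with a small helper. This is a minimal sketch: condition is a hypothetical function name used only for illustration and is not part of any Vainu client library.

```python
import json


def condition(operator, field, value):
    """Build one VQL condition of the form {"?OPERATOR": {"field_name": value}}."""
    # Accept the operator name with or without the leading "?".
    name = operator if operator.startswith("?") else "?" + operator
    return {name.upper(): {field: value}}


# Same filter as in the overview example: revenue >= 1000000.
query = condition("GTE", "financial_data.revenue", 1000000)
print(json.dumps({"query": query}))
```

The resulting dictionary can be passed as the "query" value of the request payload.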

fields

What data is returned with the companies that match query. Check https://vainu.app/data-catalogue for all the options.

database

Specifies which country to search (FI for Finland, SE for Sweden, NO for Norway, DK for Denmark, or NL for the Netherlands).

limit

Number of results returned. It is set to 1 here, so the example returns a single company.

API Endpoints

| Endpoint | Description |
| --- | --- |
| POST https://api.vainu.io/api/v3/organizations/ | Filter and retrieve company data by business registry (country-specific) |
| POST https://api.vainu.io/api/v3/organizations/count/ | Get result count without returning data. Payload is identical to sync Organizations API. fields/offset/limit are ignored. |
| POST https://api.vainu.io/api/v3/organizations/async/ | To get a lot of data, use async. Payload is identical to sync Organizations API: https://api.vainu.io/api/v3/organizations/ |

Query Structure

Every request body is a JSON object with these parameters:

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| query | object | Yes | | Filter conditions using query operators |
| fields | array | Yes | | List of field names to include in the response |
| database | string | Yes (organizations only) | | Country database code: FI, SE, NO, DK, NL |
| limit | integer | No | | Maximum results to return |
| offset | integer | No | 0 | Number of results to skip (for pagination) |
| order | string | No | null | Field name to sort by. Prefix with - for descending order |
| is_active | boolean | No | yes | Filter for active/inactive companies |
| unwind_subdocument | string | No | null | Subdocument field to unwind (flatten) into separate rows |
| unwind_subdocument_query | object | No | null | Filter conditions applied to the unwound subdocument |
| unwind_subdocument_limit | integer | No | null | Maximum unwound rows to return |
| unwind_subdocument_offset | integer | No | null | Number of unwound rows to skip |
| aggregation | object | No | null | Aggregation pipeline configuration |
| lookup | object | No | null | Lookup (join) configuration |
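The limit and offset parameters can be combined to page through a result set. The sketch below is one possible approach, not an official client: paginate and fetch_page are hypothetical names, and it assumes each request yields a list of result rows. A stand-in fetch function replaces the real API so the example runs offline.

```python
def paginate(fetch_page, base_payload, page_size=100, max_pages=None):
    """Yield rows page by page by advancing "offset" in steps of "limit".

    fetch_page is a hypothetical callable that takes a payload dict and
    returns a list of rows (for example, a wrapper around requests.post).
    """
    page = 0
    while max_pages is None or page < max_pages:
        payload = dict(base_payload, limit=page_size, offset=page * page_size)
        rows = fetch_page(payload)
        if not rows:
            break
        yield from rows
        if len(rows) < page_size:
            break  # a short page means there is nothing left to fetch
        page += 1


# Stand-in for the real API: 250 fake companies served from memory.
FAKE_ROWS = [{"business_id": str(i)} for i in range(250)]


def fake_fetch(payload):
    start = payload["offset"]
    return FAKE_ROWS[start:start + payload["limit"]]


base = {
    "query": {"?GTE": {"financial_data.revenue": 1000000}},
    "database": "FI",
    "fields": ["business_id"],
}
results = list(paginate(fake_fetch, base, page_size=100))
print(len(results))
```

In real use, fetch_page would post the payload to the Organizations endpoint and return the parsed rows; the exact response shape should be checked against an actual response.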

Query Operators

Operators define how to match values. They start with ? followed by the operator name:

Logical Operators

| Operator | Description | Example |
| --- | --- | --- |
| ?ALL | AND: all conditions must match | {"?ALL": [condition1, condition2]} |
| ?ANY | OR: at least one condition must match | {"?ANY": [condition1, condition2]} |
| ?NOT | Negates a condition | {"?NOT": {"?EQ": {"field": "value"}}} |
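Nested logical operators are easier to read when built with small helpers. The following is an illustrative sketch: all_of, any_of, and not_ are hypothetical helper names, and the field names are taken from examples elsewhere on this page.

```python
def all_of(*conditions):
    """AND: every condition must match."""
    return {"?ALL": list(conditions)}


def any_of(*conditions):
    """OR: at least one condition must match."""
    return {"?ANY": list(conditions)}


def not_(condition):
    """Negate a single condition."""
    return {"?NOT": condition}


# Revenue >= 1M AND (name contains "tech" OR starts with "Nordic")
# AND NOT domain ending in ".no".
query = all_of(
    {"?GTE": {"financial_data.revenue": 1000000}},
    any_of(
        {"?ICONTAINS": {"name": "tech"}},
        {"?STARTSWITH": {"name": "Nordic"}},
    ),
    not_({"?ENDSWITH": {"domain": ".no"}}),
)
print(query)
```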

Comparison Operators

| Operator | Description | Example |
| --- | --- | --- |
| ?EQ | Equal to | {"?EQ": {"financial_data.employees.absolute_count": 50}} |
| ?GT | Greater than | {"?GT": {"financial_data.employees.absolute_count": 50}} |
| ?GTE | Greater than or equal to | {"?GTE": {"financial_data.employees.absolute_count": 50}} |
| ?LT | Less than | {"?LT": {"financial_data.employees.absolute_count": 50}} |
| ?LTE | Less than or equal to | {"?LTE": {"financial_data.employees.absolute_count": 50}} |
| ?IN | Value is in a list | {"?IN": {"status": ["active", "pending"]}} |
| ?RANGE | Value is within [min, max] | {"?RANGE": {"financial_data.revenue": [1000000, 5000000]}} |

{"?RANGE": {"financial_data.revenue": [1000000, null]}} is identical to {"?GTE": {"financial_data.revenue": 1000000}}.
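In Python, an open bound is written as None, which the standard json module (and therefore the json= argument of requests) serializes as null. A small sketch, with range_condition as a hypothetical helper name:

```python
import json


def range_condition(field, minimum=None, maximum=None):
    """Build a ?RANGE condition; a bound left as None becomes JSON null."""
    return {"?RANGE": {field: [minimum, maximum]}}


open_ended = range_condition("financial_data.revenue", minimum=1000000)
print(json.dumps(open_ended))
# {"?RANGE": {"financial_data.revenue": [1000000, null]}}
```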


String Operators

| Operator | Description | Example |
| --- | --- | --- |
| ?CONTAINS | Contains substring (case-sensitive) | {"?CONTAINS": {"name": "tech"}} |
| ?ICONTAINS | Contains substring (case-insensitive) | {"?ICONTAINS": {"name": "TeCH"}} |
| ?STARTSWITH | Starts with value (case-sensitive) | {"?STARTSWITH": {"name": "Nordic"}} |
| ?ENDSWITH | Ends with value | {"?ENDSWITH": {"domain": ".no"}} |

?STARTSWITH allows searching with either a single string or a list of values.

Search companies that use any Salesforce technologies:

{"?STARTSWITH": {"technology_data.name": "Salesforce"}}

Search companies that use any Salesforce or HubSpot technologies:

{"?STARTSWITH": {"technology_data.name": ["Salesforce", "HubSpot"]}}

Special Operators

| Operator | Description | Example |
| --- | --- | --- |
| ?MATCH | Matches conditions within a subdocument array (e.g., contacts, addresses). All conditions inside must match the same subdocument entry. | See Contact Search |
| ?EXISTS | Check if a field exists | {"?EXISTS": {"website": true}} |

Examples


Example 2: Software Companies OR Consulting Firms with Revenue 1M-10M

This example demonstrates mixing ?ALL and ?ANY for more complex query logic:

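import requests

# Endpoint and headers reused by the examples below (same values as in the overview example)
organizations_endpoint = "https://api.vainu.io/api/v3/organizations/"
headers = {"Authorization": "Bearer YOUR_ACCESS_TOKEN"}

payload = {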
    "query": {
        "?ALL": [
            {
                "?ANY": [
                    {"?EQ": {"official_industries.code": ["62"]}},  # Software
                    {"?EQ": {"official_industries.code": ["63"]}},  # IT Services
                    {"?EQ": {"official_industries.code": ["70"]}}   # Consulting
                ]
            },
            {"?RANGE": {"financial_data.revenue": [1000000, 10000000]}}  # Revenue between 1M and 10M
        ]
    },
    "database": "FI",
    "fields": ["business_id", "name", "financial_data", "official_industries"],
    "limit": 100,
}
response = requests.post(organizations_endpoint, headers=headers, json=payload)
print(response.json())

Example 3: Complex Contact Search

Find companies with C-level contacts matching specific criteria. The ?MATCH operator ensures all conditions apply to the same contact. Without ?MATCH, a company can match when one contact satisfies the first condition and a different contact satisfies the second.

payload = {
    "query": {
        "?ALL": [
            {"?MATCH": {
                "contacts": {
                    "?ALL": [
                        {"?CONTAINS": {"titles.title": "CEO"}},
                        {"?ENDSWITH": {"email": "@company.com"}}
                    ]
                }
            }}
        ]
    },
    "fields": [
        "business_id",
        "name",
        "contacts.full_name",
        "contacts.email",
        "contacts.titles"
    ],
    "database": "FI",
    "limit": 20
}

response = requests.post(organizations_endpoint, headers=headers, json=payload)
print(response.json())
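The difference between matching with and without ?MATCH can be illustrated locally. The sketch below does not call the API and is not Vainu's implementation; it only emulates the described semantics on two made-up companies, with the contact structure simplified to a single title and email per contact.

```python
# Two hypothetical companies with simplified "contacts" subdocuments.
companies = [
    {"name": "Alpha", "contacts": [
        {"title": "CEO", "email": "ceo@other.com"},
        {"title": "Sales Manager", "email": "sales@company.com"},
    ]},
    {"name": "Beta", "contacts": [
        {"title": "CEO", "email": "ceo@company.com"},
    ]},
]


def is_ceo(contact):
    return "CEO" in contact["title"]


def has_company_email(contact):
    return contact["email"].endswith("@company.com")


def matches_without_match(company):
    # Each condition may be satisfied by a different contact.
    contacts = company["contacts"]
    return any(is_ceo(c) for c in contacts) and any(has_company_email(c) for c in contacts)


def matches_with_match(company):
    # Both conditions must hold for the same contact.
    return any(is_ceo(c) and has_company_email(c) for c in company["contacts"])


without = [c["name"] for c in companies if matches_without_match(c)]
with_match = [c["name"] for c in companies if matches_with_match(c)]
print(without)     # ['Alpha', 'Beta']
print(with_match)  # ['Beta']
```

Alpha is matched only in the first case, because its CEO and its @company.com address belong to different contacts.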

Example 4: Location-Based Search with Multiple Conditions

Combine location, industry, and employee count filters. This query finds restaurants and cafeterias (industry code 56111) that have at least 10 employees and whose main location in the official registry is in Helsinki.

payload = {
    "query": {
        "?ALL": [
            # At least 10 employees:
            {"?GTE": {"financial_data.employees.absolute_count": 10}},
            # Business unit of type registry_main_location in Helsinki
            {"?MATCH": {
                "business_units": {
                    "?ALL": [
                        {"?EQ": {"visiting_address.city": "Helsinki"}},
                        {"?EQ": {"types": "registry_main_location"}}
                    ]
                }
            }},
            # Restaurants and cafeterias (56111)
            {"?EQ": {"official_industries.code": "56111"}},
        ]
    },
    "fields": [
        "business_id",
        "name",
        "staff_number",
        "prospect_addresses.visiting_city",
        "website"
    ],
    "database": "FI",
    "limit": 20
}

response = requests.post(organizations_endpoint, headers=headers, json=payload)
print(response.json())

Example 5: Financial Data Filtering with Sort Order

Query companies by multiple financial metrics and sort results by revenue descending. Use null in a ?RANGE to leave one bound open (e.g., revenue above 1M with no upper limit).

payload = {
    "query": {
        "?ALL": [
            {"?RANGE": {"financial_data.revenue": [1000000, None]}},
            {"?GT": {"financial_data.profit": 100000}},
            {"?RANGE": {"financial_data.employee_count": [10, 50]}}
        ]
    },
    "fields": [
        "business_id",
        "name",
        "financial_data.revenue",
        "financial_data.profit",
        "financial_data.employee_count",
        "financial_statements"
    ],
    "database": "FI",
    "order": "-financial_data.revenue",
    "limit": 20
}

response = requests.post(organizations_endpoint, headers=headers, json=payload)
print(response.json())

Example 6: Finding companies that mention "AI Agents" on their website

website_data.keywords stores n-grams (shingles) of phrases and keywords found on a company's website.

What are n-grams?

In this context, n-grams are single keywords or multi-word phrases found on the website. In website_data.keywords, each n-gram can contain up to 4 words. For example, you can search for companies that have "Company Data is Fun" on their website, but you cannot search for "Company Data is Fun and Interesting".

Features of website_data.keywords data point:

  • Always case-insensitive
  • Maximum n-gram size is 40 characters
  • Maximum n-gram length is 4 words
  • Minimum n-gram length is 4 characters (For example "AI" is not included but "AI Agent" is)
  • All non-ASCII characters are removed except for "ä ö å æ ø"
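The length limits listed above can be checked before sending a query. This is a rough sketch under stated assumptions: the function names are hypothetical, character counts are assumed to include spaces, and the removal of non-ASCII characters is not modeled.

```python
def is_searchable_keyword(phrase, max_words=4, min_chars=4, max_chars=40):
    """Return True if a phrase fits the documented n-gram limits."""
    normalized = " ".join(phrase.lower().split())
    word_count = len(normalized.split(" "))
    return word_count <= max_words and min_chars <= len(normalized) <= max_chars


def shingles(text, max_words=4, min_chars=4, max_chars=40):
    """List the lowercase word n-grams of a text that fit the limits."""
    words = text.lower().split()
    grams = []
    for size in range(1, max_words + 1):
        for start in range(len(words) - size + 1):
            gram = " ".join(words[start:start + size])
            if min_chars <= len(gram) <= max_chars:
                grams.append(gram)
    return grams


print(is_searchable_keyword("AI"))                                   # too short
print(is_searchable_keyword("AI Agent"))                             # fits
print(is_searchable_keyword("Company Data is Fun and Interesting"))  # too many words
print(shingles("AI Agents at work"))
```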

Full example

This example searches for companies in Sweden that mention any of these phrases on their website: "AI Agent", "Autonomous Agent", "Agentic AI", "Agentic Workflows", or "AI Workforce".

In other words, a company matches if any of the n-grams found on its website starts with one of these phrases.

payload = {
    "query": {
        "?ALL": [
            {"?STARTSWITH": {"website_data.keywords": ["AI Agent", "Autonomous Agent", "Agentic AI", "Agentic Workflows", "AI Workforce"]}}
        ]
    },
    "fields": [
        "business_id",
        "name",
    ],
    "database": "SE",
    "limit": 20
}

response = requests.post(organizations_endpoint, headers=headers, json=payload)
print(response.json())

Advanced Features

Vehicle Data Search

Vehicle data fields require special API permissions. Contact Vainu support to enable access.

Find companies with specific vehicle types in their fleet. ?MATCH here ensures that the same vehicle matches all the conditions.

payload = {
    "query": {
        "?ALL": [
            {"?MATCH": {
                "vehicles": {
                    "?ALL": [
                        {"?EQ": {"vehicle_class": "N1"}},
                        {"?EQ": {"brand_human_readable": "Volvo"}}
                    ]
                }
            }}
        ]
    },
    "fields": [
        "business_id",
        "name",
        "vehicles"
    ],
    "database": "FI",
    "limit": 10
}

response = requests.post(organizations_endpoint, headers=headers, json=payload)
print(response.json())

List Filtering

Lists can be created in the Vainu Platform. To get all the companies in your Vainu list through the API with selected data points:

payload = {
    "list": LIST_ID,  # ID of your Vainu list
    "fields": [
        "business_id",
        "name",
        "website"
    ],
    "limit": 20
}
response = requests.post(organizations_endpoint, headers=headers, json=payload)
print(response.json())

Include target group as a part of your query:

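import datetime

# Last 7 days. datetime objects are not JSON serializable, so send strings instead
# (assumption: the API accepts ISO 8601 timestamps for modification dates).
now = datetime.datetime.now()
modifications_range = [(now - datetime.timedelta(days=7)).isoformat(), now.isoformat()]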
payload = {
    "query": {
        "?ALL": [
            {"?IN": {"target_group._id": [YOUR_VAINU_LIST_ID]}},
            {
                "?ANY": [
                    {"?RANGE": {"modifications.basic": modifications_range}},
                    {"?RANGE": {"modifications.financial_statements": modifications_range}},
                ],
            },
        ]
    },
    "fields": [
        "business_id",
        "name",
        "website"
    ],
    "database": "FI",
    "limit": 20
}

response = requests.post(organizations_endpoint, headers=headers, json=payload)
print(response.json())

Exporting Data in Different Formats

Append a format query parameter to the endpoint URL to receive results as CSV, JSONL, or XLSX instead of the default JSON.

# Export as CSV
csv_response = requests.post(
    f"{organizations_endpoint}?format=csv",
    headers=headers,
    json=payload
)

# Export as XLSX
xlsx_response = requests.post(
    f"{organizations_endpoint}?format=xlsx",
    headers=headers,
    json=payload
)

# Export as JSON Lines (one JSON object per line)
jsonl_response = requests.post(
    f"{organizations_endpoint}?format=jsonl",
    headers=headers,
    json=payload
)
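The three calls above differ only in the format value, so the URL can be built by one helper. A minimal sketch, where export_url and SUPPORTED_FORMATS are hypothetical names and the format list is taken from the text above:

```python
from urllib.parse import urlencode

# Formats named in this section; JSON is the default and needs no parameter.
SUPPORTED_FORMATS = {"csv", "jsonl", "xlsx"}


def export_url(endpoint, fmt):
    """Append the format query parameter to an endpoint URL."""
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"Unsupported export format: {fmt!r}")
    return f"{endpoint}?{urlencode({'format': fmt})}"


print(export_url("https://api.vainu.io/api/v3/organizations/", "csv"))
# https://api.vainu.io/api/v3/organizations/?format=csv
```

When saving an export to disk, writing response.content to a file opened in binary mode ("wb") is the safer choice, since XLSX is a binary format.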

Important Notes

❗️
  • Some fields (e.g., vehicle data) require special API permissions.
  • Rate limits apply to all endpoints. Avoid excessive concurrent requests.
  • Always specify the correct database code (FI, SE, NO, DK, NL) matching the country of the companies you're querying.
  • For large datasets (10,000+ results), use the async endpoints to avoid request timeouts.
  • Contact Vainu support for questions about field availability and permission requirements.
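The note about large datasets can be turned into a simple rule: check the result count first, then pick the sync or async endpoint. The sketch below is an assumption-laden illustration; pick_endpoint is a hypothetical helper, the count is passed in as a plain integer because the response format of the count endpoint is not described here, and only the endpoint URLs come from this page.

```python
BASE_ENDPOINT = "https://api.vainu.io/api/v3/organizations/"
ASYNC_THRESHOLD = 10_000  # "10,000+ results" from the note above


def pick_endpoint(result_count, threshold=ASYNC_THRESHOLD):
    """Return the async endpoint for large result sets, otherwise the sync one."""
    if result_count >= threshold:
        return BASE_ENDPOINT + "async/"
    return BASE_ENDPOINT


print(pick_endpoint(250))
print(pick_endpoint(25_000))
```

How results are retrieved from the async endpoint is not covered on this page, so that step is left out.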