Filtering Company Data
Core concepts of Vainu's Query Language (VQL)
Filter and retrieve company data from Vainu's Nordic business databases using the Organizations and Domains API endpoints.
Overview using a simple example
This example demonstrates the simplest way to query the API. In the payload we define "what companies we want to find" and "what data-points are returned".
import requests
payload = {
# What companies are you searching for?
"query": {"?GTE": {"financial_data.revenue": 1000000}},
# In what country:
"database": "FI",
# What datapoints should be returned:
"fields": [
"business_id",
"name",
"website",
"company_name",
"business_units.visiting_address"
],
# How many results:
"limit": 1
}
response = requests.post(
"https://api.vainu.io/api/v3/organizations/",
headers={"Authorization": "Bearer YOUR_ACCESS_TOKEN"},
json=payload
)
print(response.json())
query
What companies to get. In this example we get the companies in Finland whose latest reported revenue is greater than or equal to 1,000,000 EUR (the Finnish local currency).
"query": {
"?OPERATOR": {
"field_name": value
}
}
fields
What data is returned for the companies that match the query. See https://vainu.app/data-catalogue for all the options.
database
Specifies which country to search (FI for Finland, SE for Sweden, NO for Norway, DK for Denmark, or NL for the Netherlands).
limit
Number of results returned. Set to 1 here because this example only needs a single company.
API Endpoints
| Endpoint | Description |
|---|---|
| POST https://api.vainu.io/api/v3/organizations/ | Filter and retrieve company data by business registry (country-specific) |
| POST https://api.vainu.io/api/v3/organizations/count/ | Get the result count without returning data. Payload is identical to the sync Organizations API; fields, offset, and limit are ignored. |
| POST https://api.vainu.io/api/v3/organizations/async/ | Use for large result sets. Payload is identical to the sync Organizations API: [https://api.vainu.io/api/v3/organizations/](https://api.vainu.io/api/v3/organizations/) |
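Because the count endpoint takes the same payload and ignores fields, offset, and limit, one way to check the size of a result set before fetching it is to reuse the existing payload. The following is a minimal sketch: the helper name `count_request` is hypothetical, and since the response format of the count endpoint is not described here, the sketch stops at building the request.

```python
# Hypothetical helper: derive a count request from an Organizations payload.
ORGANIZATIONS_URL = "https://api.vainu.io/api/v3/organizations/"
COUNT_URL = ORGANIZATIONS_URL + "count/"

# Keys the count endpoint ignores, according to the endpoint table.
IGNORED_BY_COUNT = ("fields", "offset", "limit")


def count_request(payload):
    """Return (url, payload) for the count endpoint, dropping ignored keys."""
    trimmed = {k: v for k, v in payload.items() if k not in IGNORED_BY_COUNT}
    return COUNT_URL, trimmed


url, count_payload = count_request({
    "query": {"?GTE": {"financial_data.revenue": 1000000}},
    "database": "FI",
    "fields": ["business_id", "name"],
    "limit": 1,
})
print(url)
print(count_payload)
```

The returned pair can then be sent with requests.post(url, headers=..., json=count_payload), using the same Authorization header as the other requests.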
Query Structure
Every request body is a JSON object with these parameters:
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| query | object | Yes | | Filter conditions using query operators |
| fields | array | Yes | | List of field names to include in the response |
| database | string | Yes (organizations only) | | Country database code: FI, SE, NO, DK, NL |
| limit | integer | No | | Maximum results to return |
| offset | integer | No | 0 | Number of results to skip (for pagination) |
| order | string | No | null | Field name to sort by. Prefix with - for descending order |
| is_active | boolean | No | yes | Filter for active/inactive companies |
| unwind_subdocument | string | No | null | Subdocument field to unwind (flatten) into separate rows |
| unwind_subdocument_query | object | No | null | Filter conditions applied to the unwound subdocument |
| unwind_subdocument_limit | integer | No | null | Maximum unwound rows to return |
| unwind_subdocument_offset | integer | No | null | Number of unwound rows to skip |
| aggregation | object | No | null | Aggregation pipeline configuration |
| lookup | object | No | null | Lookup (join) configuration |
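The offset and limit parameters can be combined to walk through a result set page by page. The sketch below only builds the per-page payloads; the helper name `page_payloads` and the page size are illustrative assumptions, and any server-side cap on limit is not covered here.

```python
def page_payloads(base_payload, page_size, max_results):
    """Yield one payload per page, advancing `offset` by `page_size`."""
    for offset in range(0, max_results, page_size):
        page = dict(base_payload)  # shallow copy so the base stays unchanged
        page["offset"] = offset
        # The last page may be shorter than page_size.
        page["limit"] = min(page_size, max_results - offset)
        yield page


base = {
    "query": {"?GTE": {"financial_data.revenue": 1000000}},
    "database": "FI",
    "fields": ["business_id", "name"],
}
pages = list(page_payloads(base, page_size=100, max_results=250))
print([(p["offset"], p["limit"]) for p in pages])
# → [(0, 100), (100, 100), (200, 50)]
```

Each yielded payload can be posted to the Organizations endpoint in turn. A reasonable stopping rule, assuming the API returns fewer rows than limit on the final page, is to stop as soon as a page comes back short.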
Query Operators
Operators define how to match values. They start with ? followed by the operator name:
Logical Operators
| Operator | Description | Example |
|---|---|---|
| ?ALL | AND: all conditions must match | {"?ALL": [condition1, condition2]} |
| ?ANY | OR: at least one condition must match | {"?ANY": [condition1, condition2]} |
| ?NOT | Negates a condition | {"?NOT": {"?EQ": {"field": "value"}}} |
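Since logical operators are plain nested JSON objects, small helper functions can make deeply nested queries easier to read. This is a sketch only: the helper names `all_of`, `any_of`, and `not_` are hypothetical and not part of any Vainu client library, and the field names are taken from the examples in this document.

```python
import json


def all_of(*conditions):
    """AND: every condition must match."""
    return {"?ALL": list(conditions)}


def any_of(*conditions):
    """OR: at least one condition must match."""
    return {"?ANY": list(conditions)}


def not_(condition):
    """Negate a single condition."""
    return {"?NOT": condition}


# (industry 62 OR industry 70) AND NOT (domain ends with ".no")
query = all_of(
    any_of(
        {"?EQ": {"official_industries.code": "62"}},
        {"?EQ": {"official_industries.code": "70"}},
    ),
    not_({"?ENDSWITH": {"domain": ".no"}}),
)
print(json.dumps(query, indent=2))
```

The resulting dictionary can be used directly as the value of the query parameter in a request payload.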
Comparison Operators
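| Operator | Description | Example |
|---|---|---|
| ?EQ | Equal to | {"?EQ": {"official_industries.code": "56111"}} |
| ?GT | Greater than | {"?GT": {"financial_data.profit": 100000}} |
| ?GTE | Greater than or equal to | {"?GTE": {"financial_data.revenue": 1000000}} |
| ?LT | Less than | {"?LT": {"financial_data.revenue": 1000000}} |
| ?LTE | Less than or equal to | {"?LTE": {"financial_data.revenue": 1000000}} |
| ?IN | Value is in a list | {"?IN": {"target_group._id": [YOUR_VAINU_LIST_ID]}} |
| ?RANGE | Value is within a range | {"?RANGE": {"financial_data.revenue": [1000000, 10000000]}} |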
String Operators
| Operator | Description | Example |
|---|---|---|
| ?CONTAINS | Contains substring (case-sensitive) | {"?CONTAINS": {"name": "tech"}} |
| ?ICONTAINS | Contains substring (case-insensitive) | {"?ICONTAINS": {"name": "TeCH"}} |
| ?STARTSWITH | Starts with value (case-sensitive) | {"?STARTSWITH": {"name": "Nordic"}} |
| ?ENDSWITH | Ends with value | {"?ENDSWITH": {"domain": ".no"}} |
?STARTSWITH accepts either a single string or a list of values.
Search companies that use any Salesforce technologies:
{"?STARTSWITH": {"technology_data.name": "Salesforce"}}
Search companies that use any Salesforce or HubSpot technologies:
{"?STARTSWITH": {"technology_data.name": ["Salesforce", "HubSpot"]}}
Special Operators
| Operator | Description | Example |
|---|---|---|
| ?MATCH | Matches conditions within a subdocument array (e.g., contacts, addresses). All conditions inside must match the same subdocument entry. | See Example 3: Complex Contact Search |
| ?EXISTS | Check if a field exists | {"?EXISTS": {"website": true}} |
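To make the "same subdocument entry" rule for ?MATCH concrete, the following local simulation compares the two ways a pair of conditions can be evaluated against an array of subdocuments. It is an illustration in plain Python with made-up contact records, not Vainu's implementation, and it makes no API calls.

```python
# Hypothetical company record with two contacts (illustrative data only).
company = {
    "contacts": [
        {"title": "CEO", "email": "anna@other.example"},
        {"title": "Sales Manager", "email": "ben@company.com"},
    ]
}


def is_ceo(contact):
    return "CEO" in contact["title"]


def has_company_email(contact):
    return contact["email"].endswith("@company.com")


contacts = company["contacts"]

# Without ?MATCH-style semantics: each condition may be met by a different contact.
loose = any(is_ceo(c) for c in contacts) and any(has_company_email(c) for c in contacts)

# With ?MATCH-style semantics: a single contact must satisfy every condition.
strict = any(is_ceo(c) and has_company_email(c) for c in contacts)

print(loose, strict)
# → True False
```

Here the company would pass the loose check because one contact is a CEO and another has the right email domain, but it fails the strict check because no single contact has both.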
Examples
Example 2: Software Companies OR Consulting Firms with Revenue 1M-10M
This example demonstrates mixing ?ALL and ?ANY for more complex query logic:
payload = {
"query": {
"?ALL": [
{
"?ANY": [
{"?EQ": {"official_industries.code": ["62"]}}, # Software
{"?EQ": {"official_industries.code": ["63"]}}, # IT Services
{"?EQ": {"official_industries.code": ["70"]}} # Consulting
]
},
{"?RANGE": {"financial_data.revenue": [1000000, 10000000]}}  # Revenue between 1M and 10M
]
},
"database": "FI",
"fields": ["business_id", "name", "financial_data", "official_industries"],
"limit": 100,
}
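# Endpoint and headers reused by the remaining examples
organizations_endpoint = "https://api.vainu.io/api/v3/organizations/"
headers = {"Authorization": "Bearer YOUR_ACCESS_TOKEN"}
response = requests.post(organizations_endpoint, headers=headers, json=payload)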
print(response.json())
Example 3: Complex Contact Search
Find companies with C-level contacts matching specific criteria. The ?MATCH operator ensures that all conditions apply to the same contact. Without ?MATCH, the query could match a company where one contact satisfies the first condition and a different contact satisfies the second.
payload = {
"query": {
"?ALL": [
{"?MATCH": {
"contacts": {
"?ALL": [
{"?CONTAINS": {"titles.title": "CEO"}},
{"?ENDSWITH": {"email": "@company.com"}}
]
}
}}
]
},
"fields": [
"business_id",
"name",
"contacts.full_name",
"contacts.email",
"contacts.titles"
],
"database": "FI",
"limit": 20
}
response = requests.post(organizations_endpoint, headers=headers, json=payload)
print(response.json())
Example 4: Location-Based Search with Multiple Conditions
Combine location, industry, and employee count filters. This query finds restaurants and cafeterias (industry code 56111) with at least 10 employees whose main location in the official registry is in Helsinki.
payload = {
"query": {
"?ALL": [
# At least 10 employees:
{"?GTE": {"financial_data.employees.absolute_count": 10}},
# A business unit that is the registry main location and is in Helsinki:
{"?MATCH": {
"business_units": {
"?ALL": [
{"?EQ": {"visiting_address.city": "Helsinki"}},
{"?EQ": {"types": "registry_main_location"}}
]
}
}},
# Restaurants and cafeterias (56111)
{"?EQ": {"official_industries.code": "56111"}},
]
},
"fields": [
"business_id",
"name",
"staff_number",
"prospect_addresses.visiting_city",
"website"
],
"database": "FI",
"limit": 20
}
response = requests.post(organizations_endpoint, headers=headers, json=payload)
print(response.json())
Example 5: Financial Data Filtering with Sort Order
Query companies by multiple financial metrics and sort results by revenue in descending order. Use null in a ?RANGE to leave one bound open, for example revenue above 1M with no upper limit. In Python, None is serialized to JSON null.
payload = {
"query": {
"?ALL": [
{"?RANGE": {"financial_data.revenue": [1000000, None]}},
{"?GT": {"financial_data.profit": 100000}},
{"?RANGE": {"financial_data.employee_count": [10, 50]}}
]
},
"fields": [
"business_id",
"name",
"financial_data.revenue",
"financial_data.profit",
"financial_data.employee_count",
"financial_statements"
],
"database": "FI",
"order": "-financial_data.revenue",
"limit": 20
}
response = requests.post(organizations_endpoint, headers=headers, json=payload)
print(response.json())
Example 6: Finding companies that mention "AI Agents" on their website
website_data.keywords stores n-grams (shingles) of phrases and keywords found on a company's website.
In this context, n-grams are single keywords or multi-word phrases found on the website. In website_data.keywords, each token can contain up to 4 words. For example, you can search for companies that have "Company Data is Fun" on their website, but you cannot search for "Company Data is Fun and Interesting".
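As an illustration of what such shingles look like, the sketch below generates n-grams from a piece of text under the limits this section gives for the data point (lowercase, at most 4 words, 4 to 40 characters, non-ASCII characters dropped except ä ö å æ ø). Splitting on whitespace and the function names are assumptions made for the example; Vainu's actual tokenizer is not documented here and may treat punctuation differently.

```python
ALLOWED_NON_ASCII = set("äöåæø")
MAX_WORDS = 4   # maximum n-gram length in words
MAX_CHARS = 40  # maximum n-gram size in characters
MIN_CHARS = 4   # minimum n-gram length in characters


def normalize(text):
    """Lowercase the text and drop non-ASCII characters except ä ö å æ ø."""
    lowered = text.lower()
    return "".join(ch for ch in lowered if ch.isascii() or ch in ALLOWED_NON_ASCII)


def shingles(text):
    """Return every 1- to 4-word n-gram that fits the character limits."""
    words = normalize(text).split()  # assumption: tokens are whitespace-separated
    result = []
    for size in range(1, MAX_WORDS + 1):
        for start in range(len(words) - size + 1):
            gram = " ".join(words[start:start + size])
            if MIN_CHARS <= len(gram) <= MAX_CHARS:
                result.append(gram)
    return result


grams = shingles("AI Agents for Sales")
print(grams)
```

In this sketch "ai" is dropped for being shorter than 4 characters, while "ai agents" is kept, which is why a prefix search such as ?STARTSWITH with "AI Agent" can still find pages that mention AI agents.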
Features of the website_data.keywords data point:
- Always case-insensitive
- Maximum n-gram size is 40 characters
- Maximum n-gram length is 4 words
- Minimum n-gram length is 4 characters (for example, "AI" is not included but "AI Agent" is)
- All non-ASCII characters are removed except for "ä ö å æ ø"
Full example
This example searches for companies in Sweden that mention any of these phrases on their website: "AI Agent", "Autonomous Agent", "Agentic AI", "Agentic Workflows", or "AI Workforce".
In other words, at least one of the n-grams found on the website starts with one of these phrases.
payload = {
"query": {
"?ALL": [
{"?STARTSWITH": {"website_data.keywords": ["AI Agent", "Autonomous Agent", "Agentic AI", "Agentic Workflows", "AI Workforce"]}}
]
},
"fields": [
"business_id",
"name",
],
"database": "SE",
"limit": 20
}
response = requests.post(organizations_endpoint, headers=headers, json=payload)
print(response.json())
Advanced Features
Vehicle Data Search
Vehicle data fields require special API permissions. Contact Vainu support to enable access.
Find companies with specific vehicle types in their fleet. ?MATCH here ensures that the same vehicle matches all the conditions.
payload = {
"query": {
"?ALL": [
{"?MATCH": {
"vehicles": {
"?ALL": [
{"?EQ": {"vehicle_class": "N1"}},
{"?EQ": {"brand_human_readable": "Volvo"}}
]
}
}}
]
},
"fields": [
"business_id",
"name",
"vehicles"
],
"database": "FI",
"limit": 10
}
response = requests.post(organizations_endpoint, headers=headers, json=payload)
print(response.json())
List Filtering
Lists can be created in the Vainu Platform. To retrieve all the companies in one of your Vainu lists through the API, with the data points you select, pass the list ID in the payload:
payload = {
"list": LIST_ID,
"fields": [
"business_id",
"name",
"website"
],
"limit": 20
}
response = requests.post(organizations_endpoint, headers=headers, json=payload)
print(response.json())
Include target group as a part of your query:
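import datetime

# Changes from the last 7 days. datetime objects are not JSON-serializable,
# so they are converted to strings (ISO 8601 is an assumption about the expected format).
now = datetime.datetime.now()
modifications_range = [(now - datetime.timedelta(days=7)).isoformat(), now.isoformat()]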
payload = {
"query": {
"?ALL": [
{"?IN": {"target_group._id": [YOUR_VAINU_LIST_ID]}},
{
"?ANY": [
{"?RANGE": {"modifications.basic": modifications_range}},
{"?RANGE": {"modifications.financial_statements": modifications_range}},
],
},
]
},
"fields": [
"business_id",
"name",
"website"
],
"database": "FI",
"limit": 20
}
response = requests.post(organizations_endpoint, headers=headers, json=payload)
print(response.json())
Exporting Data in Different Formats
Append a format query parameter to the endpoint URL to receive results as CSV, JSONL, or XLSX instead of the default JSON.
# Export as CSV
csv_response = requests.post(
f"{organizations_endpoint}?format=csv",
headers=headers,
json=payload
)
# Export as XLSX
xlsx_response = requests.post(
f"{organizations_endpoint}?format=xlsx",
headers=headers,
json=payload
)
# Export as JSON Lines (one JSON object per line)
jsonl_response = requests.post(
f"{organizations_endpoint}?format=jsonl",
headers=headers,
json=payload
)
Important Notes
- Some fields (e.g., vehicle data) require special API permissions.
- Rate limits apply to all endpoints. Avoid excessive concurrent requests.
- Always specify the correct database code (FI, SE, NO, DK, NL) matching the country of the companies you're querying.
- For large datasets (10,000+ results), use the async endpoints to avoid request timeouts.
- Contact Vainu support for questions about field availability and permission requirements.