the dataset includes
Grammar usage guide and real-world examples
Usage summary
The phrase "the dataset includes" is correct and usable in written English.
You can use it when describing the contents or components of a dataset in a research paper, report, or data analysis context. Example: "In our study, the dataset includes various demographic information, such as age, gender, and income level."
✓ Grammatically correct
Science
Academia
News & Media
Table of contents
Usage summary
Human-verified examples
Expert writing tips
Linguistic context
Ludwig's wrap-up
Alternative expressions
FAQs
Human-verified examples from authoritative sources
Exact Expressions
60 human-written examples
The dataset includes urban CO2 emissions, GDP, and population.
The dataset includes 11 engineering bioreactor parameters as input variables.
Science
The dataset includes 7 separate days worth of tweets, indexed and ready to upload into our Solr database.
The dataset includes large variability in growing climates, a prerequisite to investigate phenology models for use in climate change applications.
The dataset includes 330 carbonate core plug samples from twelve different study areas and hence includes a highly diverse range of carbonate rock types.
The dataset includes variables such as temperature, salinity, oxygen, phosphate, silicate, phytoplankton and zooplankton community structure and abundance, meteorological conditions, fish and marine mammal counts, and more.
Academia
The dataset includes EEG, eye-tracking, and physiological (GSR and Heart rate) signals collected from 34 individuals (18 able-bodied and 16 motor-impaired).
Science
The dataset includes two-minute ensemble averaged continuous velocity and backscatter profiles, supplemented by spatially gridded maps for each velocity component, error velocity and local bathymetry.
Science
The dataset includes various questions about relationships with neighbours.
Science
The dataset includes Big Five personality annotations obtained by crowdsourcing.
The dataset includes 96.5% of all patient admissions statewide based on an annual audit of hospitals.
Science
Expert writing tips
Best practice
When describing the contents of a dataset, be specific and list the key variables or components it includes. For example, instead of saying "the dataset includes various parameters", specify "the dataset includes temperature, salinity, and depth measurements."
Common error
Don't use overly general terms when describing what a dataset includes. Instead of stating "the dataset includes information", be precise by listing specific variables or types of data contained within the dataset. For instance, use "the dataset includes patient demographics, medical history, and treatment outcomes."
Source & Trust
Authority and reliability: 85%
Expert rating: 4.5/5
Real-world application tested
Linguistic context
The phrase "the dataset includes" functions as a declarative statement to introduce the contents or components of a dataset. Ludwig AI indicates it is a common and correct phrase. Examples from Ludwig showcase its use across diverse fields, emphasizing the specific elements within a dataset.
Frequent in
Science: 79%
Academia: 13%
News & Media: 5%
Less common in
Formal & Business: 1%
Encyclopedias: 1%
Wiki: 0%
Ludwig's wrap-up
In summary, the phrase "the dataset includes" is a grammatically correct and very common way to introduce the contents of a dataset. Ludwig AI affirms its validity and widespread use. Its primary function is to tell readers which variables or components the dataset contains, and it appears most often in scientific and academic writing. When using this phrase, avoid vagueness by listing precise details and key variables. Alternatives like "the dataset contains" or "the dataset comprises" can be used for variety.
More alternative expressions
Phrases that express similar concepts, ordered by semantic similarity:
the dataset contains
Replaces "includes" with "contains", emphasizing the presence of specific elements within the dataset.
the dataset comprises
Uses "comprises" to denote that the dataset is made up of certain elements.
the dataset encompasses
Substitutes "includes" with "encompasses", suggesting a broader and more comprehensive inclusion of data.
the dataset incorporates
Employs "incorporates" to indicate that the dataset integrates various components.
the dataset features
Replaces "includes" with "features", highlighting key or prominent aspects of the dataset.
the dataset provides
Uses "provides" to suggest that the dataset offers specific types of information.
the dataset presents
Substitutes "includes" with "presents", emphasizing the dataset's role in showcasing certain data.
the dataset details
Employs "details" to indicate that the dataset offers specifics about certain elements.
the dataset documents
Replaces "includes" with "documents", highlighting the dataset's role in recording information.
the dataset specifies
Uses "specifies" to suggest that the dataset explicitly defines its elements.
FAQs
How can I use "the dataset includes" in a sentence?
You can use "the dataset includes" to introduce the elements or variables contained within a particular dataset. For example, "The dataset includes demographic information, medical history, and treatment outcomes for each patient."
What are some alternatives to "the dataset includes"?
Some alternatives to "the dataset includes" are "the dataset contains", "the dataset comprises", or "the dataset encompasses". The best choice depends on the specific context and the nuance you wish to convey.
Is it better to say "the dataset includes" or "the dataset contains"?
Both "the dataset includes" and "the dataset contains" are grammatically correct and often interchangeable. "Includes" suggests a listing of components, while "contains" emphasizes the presence of those components within the dataset. The choice often comes down to personal preference.
How do I avoid being too vague when using "the dataset includes"?
To avoid vagueness, follow "the dataset includes" with a specific list of variables, features, or types of data contained within the dataset. Instead of saying "the dataset includes information", specify "the dataset includes patient age, gender, diagnosis codes, and treatment dates".