Exact (8)
Association rules have been widely used for detecting relations between attribute-value pairs of categorical datasets.
In this paper, we present a k-means type clustering algorithm that finds clusters in data subspaces in mixed numeric and categorical datasets.
Thus, the application domain is quite different from that of DAC, which is designed to work on large-scale and large-domain categorical datasets.
We evaluate the proposed algorithm on several categorical datasets, compare it against random initialization and two other initialization methods, and show that the proposed method performs better in terms of accuracy and time complexity.
Nie et al. (2011) suggested that the DT not only produces results which are easy to understand, but that it also has the ability to build models using numerical and categorical datasets.
The Gaussian assumption is also violated for categorical datasets, such as data on mutation types and copy number variation data (Hudson et al., 2010).
Similar (52)
We test our approach on a categorical dataset that is large in size (over 1TB), volume (more than 4 billion records) and domain (800 million distinct values among all the features).
For the combined dataset analysis, a categorical variable indicating dataset was included to avoid bias from interpopulation differences in mean TRF length.
Although much research has been done on numerical, categorical, or mixed datasets, most of the proposed methods are not very effective or cannot guarantee a unique clustering result.
For example, the number of process parameters (predictors) may be large with respect to the number of samples, the predictors may contain either numerical or categorical values, the datasets may contain missing values and, finally, the relationship among the predictors and product yield may be non-linear.
Most importantly, they perform differently in the presence of small datasets, outliers, categorical factors, and missing values.