Data Preprocessing with Orange Tool

Preprocessing is a key step in data science, and Orange provides several preprocessors for it. This post covers four of them: discretization, continuization, normalization, and randomization.

Introduction

In Orange, drag the Python Script widget from the left panel onto the canvas and double-click it to open its editor. The snippets below can be run from there.

Discretization

import Orange

# Load the bundled iris dataset
store = Orange.data.Table("iris.tab")

# Discretize continuous features into 3 equal-frequency intervals
discretizer = Orange.preprocess.Discretize()
discretizer.method = Orange.preprocess.discretize.EqualFreq(n=3)
d_store = discretizer(store)

print("Original dataset:")
for e in store[:3]:
    print(e)

print("Discretized dataset:")
for e in d_store[:3]:
    print(e)
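To make the idea behind equal-frequency discretization concrete, here is a minimal sketch in plain Python. It is not Orange's implementation; the helper names `equal_freq_cut_points` and `assign_bin` are made up for illustration, and the sketch assumes the values contain no ties at the bin boundaries.

```python
from bisect import bisect_right


def equal_freq_cut_points(values, n_bins):
    """Return n_bins - 1 thresholds that split the values into
    groups of roughly equal size (hypothetical helper)."""
    ordered = sorted(values)
    size = len(ordered)
    cuts = []
    for k in range(1, n_bins):
        idx = k * size // n_bins
        # Put the threshold halfway between the two neighbouring values
        cuts.append((ordered[idx - 1] + ordered[idx]) / 2)
    return cuts


def assign_bin(value, cuts):
    """Return the index of the interval that the value falls into."""
    return bisect_right(cuts, value)


values = [7, 1, 4, 9, 3, 8, 2, 6, 5]
cuts = equal_freq_cut_points(values, n_bins=3)
bins = [assign_bin(v, cuts) for v in values]
print(cuts)
print(bins)
```

Each of the three intervals ends up holding three of the nine values, which is the property that distinguishes equal-frequency from equal-width binning.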

Continuization

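import Orange

# Load the bundled titanic dataset, which has categorical features
titanic = Orange.data.Table("titanic")

# Replace categorical features with numeric ones
continuizer = Orange.preprocess.Continuize()
titanic1 = continuizer(titanic)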
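One common way to continuize a categorical feature is to replace it with one 0/1 indicator column per category. The sketch below shows that idea in plain Python. It is an assumption-laden illustration rather than Orange's own code: the function `continuize_rows` is hypothetical, the rows are hand-written in the style of the titanic data, and the input is assumed to be non-empty.

```python
def continuize_rows(rows):
    """One-hot encode every column of a list of categorical rows
    (hypothetical helper, not part of Orange)."""
    n_cols = len(rows[0])
    # Sorted distinct values of each column define the indicator order
    categories = [sorted({row[i] for row in rows}) for i in range(n_cols)]
    encoded = []
    for row in rows:
        vector = []
        for i, value in enumerate(row):
            vector.extend(1.0 if value == c else 0.0 for c in categories[i])
        encoded.append(vector)
    return categories, encoded


# Hand-written sample rows: (status, age, sex)
sample_rows = [
    ("first", "adult", "male"),
    ("crew", "adult", "female"),
    ("third", "child", "male"),
]
categories, encoded = continuize_rows(sample_rows)
print(categories)
for vector in encoded:
    print(vector)
```

Every row becomes a fixed-length numeric vector, so methods that expect numeric input can work with the data.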

Normalization

from Orange.data import Table
from Orange.preprocess import Normalize

data = Table("iris.tab")

# Rescale each numeric feature by its span (max - min)
normalizer = Normalize(norm_type=Normalize.NormalizeBySpan)
normalized_data = normalizer(data)
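Normalizing by span amounts to min-max scaling. As a rough sketch of the arithmetic, assuming the zero-based variant that maps each feature to the range [0, 1], the computation for one column looks like this. The function `normalize_by_span` is a hypothetical helper, not an Orange API.

```python
def normalize_by_span(values):
    """Scale values to [0, 1] using (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # A constant column has no span; map everything to 0.0
        return [0.0 for _ in values]
    return [(v - lo) / span for v in values]


print(normalize_by_span([2.0, 4.0, 6.0]))  # [0.0, 0.5, 1.0]
```

The smallest value maps to 0, the largest to 1, and everything else falls proportionally in between.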

Randomization

class Orange.preprocess.Randomize(rand_type=Randomize.RandomizeClasses, rand_seed=None)

Constructs a preprocessor that randomizes classes, attributes, and/or metas. Given a data table, the preprocessor returns a new table in which the selected part of the data is shuffled.

from Orange.data import Table
from Orange.preprocess import Randomize
data = Table("iris")
randomizer = Randomize(Randomize.RandomizeClasses)
randomized_data = randomizer(data)
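Shuffling only the class column breaks the link between features and labels while leaving the features themselves untouched. The plain-Python sketch below illustrates this under stated assumptions: `randomize_classes` is a hypothetical helper and the sample rows are made up, so it shows the idea rather than Orange's implementation.

```python
import random


def randomize_classes(rows, labels, seed=None):
    """Return (row, label) pairs with the labels shuffled and the
    feature rows left in their original order (hypothetical helper)."""
    rng = random.Random(seed)  # a fixed seed gives a repeatable shuffle
    shuffled = list(labels)    # copy so the caller's list is not modified
    rng.shuffle(shuffled)
    return list(zip(rows, shuffled))


# Made-up sample data: two features per row and one class label each
sample_rows = [[5.1, 3.5], [4.9, 3.0], [6.3, 3.3], [5.8, 2.7]]
sample_labels = ["setosa", "setosa", "virginica", "virginica"]

for row, label in randomize_classes(sample_rows, sample_labels, seed=42):
    print(row, label)
```

Passing a seed plays the same role as `rand_seed` in the signature above: the shuffle becomes reproducible across runs.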
