4. Basics of Orange Tool

Mansi Khatri
4 min read · Aug 27, 2021


What is the Orange tool?

Orange is an open-source tool for data visualization, data mining, and machine learning. It is a scriptable environment for quickly prototyping new algorithms and testing schemes, made up of a group of Python-based modules built around a core library.
The objective of Orange is to provide a platform for experiment-based selection, predictive modelling, and recommendation systems.

Orange Widgets

Orange widgets give us a graphical user interface to Orange's data mining and machine learning techniques. They include widgets for data entry and preprocessing, classification, regression, association rules, and clustering; a set of widgets for model assessment and visualization of the assessment results; and widgets for exporting the models.

Example
Widgets communicate through tokens that are passed from a sender widget to a receiver widget. For example, a File widget outputs data objects, which can be received by a Classification Tree learner widget. The learner builds a classification model and sends it to a widget that shows the tree graphically.
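To make the idea of passing tokens concrete, here is a minimal plain-Python sketch. It is not Orange's own API: the Widget class, its connect and receive methods, and the one-split "tree learner" are all hypothetical names made up for illustration, and the four-row dataset is toy data.

```python
# Plain-Python illustration of widget-to-widget signals; NOT Orange's API.
# Widget, connect and receive are hypothetical names used only for this sketch.

class Widget:
    """A node that receives a token, processes it, and forwards the result."""

    def __init__(self, name, process):
        self.name = name
        self.process = process      # function applied to each incoming token
        self.receivers = []         # downstream widgets
        self.last_output = None

    def connect(self, receiver):
        """Link this widget to a receiver; returns the receiver for chaining."""
        self.receivers.append(receiver)
        return receiver

    def receive(self, token):
        """Process an incoming token and pass the result downstream."""
        self.last_output = self.process(token)
        for receiver in self.receivers:
            receiver.receive(self.last_output)


# Toy data: (feature value, class label) pairs.
rows = [(1.0, "no"), (2.0, "no"), (3.0, "yes"), (4.0, "yes")]


def learn_threshold(data):
    """Stand-in for a tree learner: one split halfway between the two classes."""
    no_max = max(x for x, label in data if label == "no")
    yes_min = min(x for x, label in data if label == "yes")
    return {"split_at": (no_max + yes_min) / 2}


file_widget = Widget("File", lambda _: rows)
tree_learner = Widget("Tree", learn_threshold)
tree_viewer = Widget(
    "Tree Viewer",
    lambda model: f"x <= {model['split_at']} -> no, else yes",
)

# File -> Tree -> Tree Viewer, like drawing two links on the canvas.
file_widget.connect(tree_learner).connect(tree_viewer)
file_widget.receive(None)           # trigger the source widget
print(tree_viewer.last_output)
```

Each widget only knows its own job and who is listening to it, which is why widgets can be rearranged freely on the canvas.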

How to use workflows in Orange?

It is very easy to create a workflow in the Orange tool. Let's walk through a simple example. First, take a File widget and load some data into it. You can use one of the pre-loaded datasets or an external file from your device. To check that the data has loaded, connect the File widget to a Data Table widget and double-click the Data Table. You can see that the data are laid out in rows and columns.

To visualise the data, connect the File widget to a Distributions widget. This way the File widget sends its data to Distributions, where we can walk through all the features in the data. We can also inspect the data in a Scatter Plot.

How to do basic data exploration (data distribution, data information)?

Distributions

Displays value distributions for a single attribute.

  • Inputs
    Data: input dataset
  • Outputs
    Selected Data: instances selected from the plot
    Data: data with an additional column showing whether an instance is selected

The Distributions widget displays the value distribution of discrete or continuous attributes. If the data contains a class variable, distributions may be conditioned on the class.

The graph shows how many times (i.e., in how many instances) each attribute value appears in the data. If the data contains a class variable, the class distribution for each attribute value is displayed.

For continuous attributes, the attribute values are also displayed as a histogram. It is possible to fit various distributions to the data, for example, a Gaussian kernel density estimation. The Hide bars option hides the histogram bars and shows only the distribution (the old behaviour of Distributions).

In class-less domains, the bars are displayed in blue. We used the Housing dataset.
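The counts behind such a plot are simple to reproduce. The sketch below is plain Python rather than Orange code, and the rows are made-up toy data (not the Housing dataset); the function names are hypothetical. It counts the values of a discrete attribute, splits those counts by class, and bins a continuous attribute into an equal-width histogram.

```python
from collections import Counter, defaultdict

# Hypothetical toy rows: (discrete attribute, continuous attribute, class).
rows = [
    ("red", 1.2, "a"),
    ("red", 2.8, "b"),
    ("blue", 3.1, "a"),
    ("red", 4.9, "a"),
    ("blue", 0.4, "b"),
]
colours = [r[0] for r in rows]
sizes = [r[1] for r in rows]
classes = [r[2] for r in rows]


def value_counts(values):
    """How many instances carry each value of a discrete attribute."""
    return Counter(values)


def class_conditioned_counts(values, labels):
    """Per attribute value, how many instances fall into each class."""
    table = defaultdict(Counter)
    for value, label in zip(values, labels):
        table[value][label] += 1
    return {value: dict(counts) for value, counts in table.items()}


def histogram(values, n_bins):
    """Count continuous values in n_bins equal-width bins over [min, max]."""
    low, high = min(values), max(values)
    width = (high - low) / n_bins
    counts = [0] * n_bins
    if width == 0:                  # all values identical: one full bin
        counts[0] = len(values)
        return counts
    for v in values:
        # The maximum value belongs to the last bin, not a bin of its own.
        index = min(int((v - low) / width), n_bins - 1)
        counts[index] += 1
    return counts


print(value_counts(colours))
print(class_conditioned_counts(colours, classes))
print(histogram(sizes, 3))
```

The class-conditioned table corresponds to the coloured segments within each bar, while the plain counts are what a class-less dataset would show.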

Data information

Displays information on a selected dataset.

  • Inputs
    Data: input dataset

A simple widget that presents information on the dataset size, features, targets, meta attributes, and location.

1. Information on dataset size
2. Information on discrete and continuous features
3. Information on targets
4. Information on meta attributes
5. Information on where the data is stored
6. Produce a report.
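As a rough illustration of the kind of summary listed above, here is a small plain-Python sketch. It does not use Orange; the describe function, its parameters, and the sample table are assumptions made for this example. A feature is treated as continuous when every one of its values parses as a number, which is a simplification of how a real tool decides.

```python
def is_number(text):
    """True if the string can be parsed as a float."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def describe(header, rows, target=None, metas=()):
    """Summarise a table given as a list of column names and a list of rows."""
    columns = {name: [row[i] for row in rows] for i, name in enumerate(header)}
    # Features are all columns that are neither the target nor meta attributes.
    features = [n for n in header if n != target and n not in metas]
    continuous = [n for n in features if all(is_number(v) for v in columns[n])]
    discrete = [n for n in features if n not in continuous]
    return {
        "instances": len(rows),
        "continuous_features": continuous,
        "discrete_features": discrete,
        "target": target,
        "metas": list(metas),
    }


# Hypothetical sample table.
header = ["name", "age", "city", "income", "bought"]
rows = [
    ["ann", "34", "Oslo", "52000", "yes"],
    ["bob", "29", "Rome", "48000", "no"],
    ["cy", "41", "Oslo", "61000", "yes"],
]

summary = describe(header, rows, target="bought", metas=("name",))
for key, value in summary.items():
    print(f"{key}: {value}")
```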

How to load your data in Orange and how to load external data from an API in Orange?

  • The File widget reads attribute-value data from an input file. It can read data from Excel (.xlsx), simple tab-delimited (.txt), and comma-separated (.csv) files, or from URLs.
  • The data is normally a table where data instances are in rows and data attributes are in columns.
  • To load external data, we just need to create a shareable link to the file and paste it into the File widget.
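Outside the GUI, the same idea (read a delimited table from a local file or from a link) can be sketched with Python's standard library. This is only an illustration, not what the File widget does internally: read_text and parse_table are hypothetical helper names, and the sketch assumes UTF-8 text with the column names in the first row.

```python
import csv
import io
import urllib.request


def read_text(source):
    """Return the text behind a local path or an http(s) URL."""
    if source.startswith(("http://", "https://")):
        with urllib.request.urlopen(source) as response:
            return response.read().decode("utf-8")
    with open(source, encoding="utf-8", newline="") as handle:
        return handle.read()


def parse_table(text, delimiter=","):
    """Split delimited text into (header, rows).

    Instances are in rows and attributes are in columns; blank lines are skipped.
    """
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=delimiter)
    records = [row for row in reader if row]
    if not records:
        return [], []
    return records[0], records[1:]


# Comma-separated text, as in a .csv file.
header, rows = parse_table("sepal,petal,species\n5.1,1.4,setosa\n6.3,4.9,virginica\n")
print(header)
print(rows)

# Tab-delimited text, as in a simple .txt file.
tab_header, tab_rows = parse_table("x\ty\n1\t2\n", delimiter="\t")
print(tab_header, tab_rows)
```

To read from a shared link, pass the result of read_text(link) to parse_table; the link must point directly at the raw file rather than at a preview page.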

For more information, see my next blog on visual programming with the Orange tool. Thank you. 😇
