Explore whether A relates to B. For example, height and weight are related; taller people tend to be heavier than shorter people. Upload your data and A + Ω will find such relationships for you.
This Recipe can give you information about possible correlations between columns in your dataset and how those data are correlated: Do some numbers increase while others decrease? Are certain data proportional to others? Is there direct, neutral or inverse correlation between items in your dataset?
You need to upload an Excel or CSV file with numeric and/or categorical data (data that can be divided into specific groups, such as race, country, color, type of food, variety, sport, etc.). Click here to see an example.
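As a rough illustration of the difference between numeric and categorical columns, the sketch below reads a small CSV and labels each column. The sample data, the column names and the `classify_columns` helper are hypothetical; how the A + Ω platform itself detects column types is not described here.

```python
import csv
import io


def _is_number(text):
    """Return True if the text parses as a number (a trailing % is ignored)."""
    try:
        float(text.rstrip("%"))
        return True
    except ValueError:
        return False


def classify_columns(csv_text):
    """Label each CSV column as 'numeric' or 'categorical' from its values."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    kinds = {}
    for name in reader.fieldnames or []:
        # Ignore empty cells when deciding the column type.
        values = [row[name].strip() for row in rows if row[name].strip()]
        is_numeric = bool(values) and all(_is_number(v) for v in values)
        kinds[name] = "numeric" if is_numeric else "categorical"
    return kinds


# Hypothetical sample data, for illustration only.
sample = """country,sport,height_cm,weight_kg
Greece,basketball,198,95
Spain,football,180,76
Italy,tennis,185,80
"""

print(classify_columns(sample))
```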
Meet the author
Alpha + Omega Team
Bio
The Alpha + Omega team includes data scientists, software engineers, journalists and data geeks!
We strive to create a unified platform that everyone can use with a little effort, in order to create awesome
data projects! Read more here.
Relevant recipes
Fraud Detection for Political Science
Detect fraud patterns on election datasets #fraud-detection
A + Ω platform gave me the opportunity to sell and market my analysis. It
took only a few hours for the team to integrate my code and start promoting
my recipe.
Elena Chatziapostolou
Data Scientist
Are you a data scientist / statistician / algorithm
enthusiast?
Benefit from A + Ω
1. Create a profile
2. Increase your customer base
3. Integrate and sell your analysis methods via our platform
In order to investigate a possible story, upload data and specify what you want to predict. A + Ω will try
to identify patterns. Ask A + Ω questions and it will respond with its predictions!
Example:
A journalist wants to explore an alternative election prediction model that uses
various economic and political indicators instead of polling data, and also wants to
deal with the challenges of building a model when there is very little training data.
Let's suppose that the journalist creates a CSV file from the following data sources:
Historical presidential approval ratings (highest and lowest for each president)
from Wikipedia
GDP growth in the election year from the World Bank
*The numbers in the table are given only as an example of how A + Ω works. They do not
correspond to real percentage results, election forecasts or yearly GDP growth.
| Year | Current president | Name of political party | Number of terms | Highest Approval Forecast (%) | Lowest Approval Forecast (%) | GDP per capita | Winning party |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1936 | Roosevelt | Democrat | 1 | 83% | 48% | 8.67 | Democrat |
| 1940 | Roosevelt | Democrat | 2 | 83% | 48% | 10.02 | Democrat |
| 1944 | Roosevelt | Democrat | 3 | 83% | 48% | 16.91 | Democrat |
| 1948 | Truman | Democrat | 1 | 22% | 22% | 14.45 | Democrat |
| 1952 | Truman | Democrat | 2 | 22% | 22% | 16.69 | Republican |
| 1956 | Eisenhower | Republican | 1 | 79% | 47% | 17.47 | Republican |
| 1960 | Eisenhower | Republican | 2 | 79% | 47% | 17.76 | Democrat |
| 1964 | Johnson | Democrat | 1 | 79% | 34% | 20.56 | Democrat |
| 1968 | Johnson | Democrat | 2 | 79% | 34% | 24.04 | Republican |
| 1972 | Nixon | Republican | 1 | 66% | 24% | 26.12 | Republican |
| 1976 | Nixon | Republican | 2 | 66% | 24% | 27.44 | Democrat |
| 1980 | Carter | Democrat | 1 | 74% | 28% | 29.86 | Republican |
| 1984 | Reagan | Republican | 1 | 71% | 35% | 32.73 | Republican |
| 1988 | Reagan | Republican | 2 | 71% | 35% | 36.7 | Republican |
| 1992 | Bush | Republican | 1 | 89% | 29% | 38.1 | Democrat |
| 1996 | Clinton | Democrat | 1 | 73% | 37% | 41.39 | Democrat |
| 2000 | Clinton | Democrat | 2 | 73% | 37% | 46.94 | Republican |
| 2004 | Bush | Republican | 1 | 90% | 25% | 49.69 | Republican |
| 2008 | Bush | Republican | 2 | 90% | 25% | 50.2 | Democrat |
| 2012 | Obama | Democrat | 1 | 69% | 38% | 51.56 | Democrat |
| 2016 | Obama | Democrat | 2 | 69% | 38% | 55.00 | Republican |
2. The app asks the user to select a column of interest, for example "Winning party". The question the app will answer is: can we predict the value of this column by taking all the other data into account?
3. The user selects the "Winning party" column. The app will try to predict the winning party based on the other columns.
4. The user runs the analysis and waits for the magic to happen. Alpha + Omega will try to make sense of the data, identify patterns and create a prediction model.
5. Alpha + Omega is now ready to predict the "Winner". The user is asked to provide new data:
| Number of terms | % result in last election | Election forecast (%) | Year GDP growth |
| --- | --- | --- | --- |
| To be completed by journalist | To be completed by journalist | To be completed by journalist | To be completed by journalist |
6. The user clicks the option "Make a prediction for "Winner"".
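The prediction steps above can be sketched in a few lines. The page does not say which model A + Ω builds, so the example below swaps in a simple 1-nearest-neighbour classifier as an assumption, and uses leave-one-out evaluation as one common way to cope with very little training data. The rows are copied from the example table (which is itself illustrative); the function names and the new data point are hypothetical.

```python
import math

# A few rows from the example table:
# (number of terms, highest approval %, lowest approval %, GDP per capita, winning party)
ROWS = [
    (1, 83, 48, 8.67, "Democrat"),     # 1936
    (2, 22, 22, 16.69, "Republican"),  # 1952
    (1, 79, 47, 17.47, "Republican"),  # 1956
    (2, 79, 34, 24.04, "Republican"),  # 1968
    (1, 74, 28, 29.86, "Republican"),  # 1980
    (1, 73, 37, 41.39, "Democrat"),    # 1996
    (2, 90, 25, 50.20, "Democrat"),    # 2008
    (2, 69, 38, 55.00, "Republican"),  # 2016
]


def make_scaler(rows):
    """Build a min-max scaler from the feature ranges found in `rows`."""
    columns = list(zip(*[row[:-1] for row in rows]))
    lows = [min(col) for col in columns]
    spans = [(max(col) - min(col)) or 1.0 for col in columns]
    return lambda feats: [(f - lo) / s for f, lo, s in zip(feats, lows, spans)]


def predict(train_rows, features):
    """Return the label of the training row closest to `features` (1-NN)."""
    scale = make_scaler(train_rows)
    target = scale(features)
    nearest = min(train_rows, key=lambda row: math.dist(scale(row[:-1]), target))
    return nearest[-1]


def leave_one_out_accuracy(rows):
    """Hold out each row in turn, train on the rest, and count correct guesses."""
    hits = 0
    for i, held_out in enumerate(rows):
        train = rows[:i] + rows[i + 1:]
        hits += predict(train, held_out[:-1]) == held_out[-1]
    return hits / len(rows)


# Hypothetical new data supplied by the journalist.
print(predict(ROWS, (1, 70, 40, 57.0)))
print(leave_one_out_accuracy(ROWS))
```

With so few rows, the leave-one-out score is only a rough indication of how well any model would generalise.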
In order to investigate a possible story, explore whether A relates to B. For example, height and weight are related; taller people tend to be
heavier than shorter people. Upload your data and A + Ω will find such relationships for you.
Example:
Let's say a journalist wants to explore whether demographic and economic factors affect the university success rating of teenagers.
She has collected the following data:
*The numbers in the table are given only as an example of how A + Ω works and do not
correspond to real figures.
| Family income per year | Neighborhood | Average number of children per family | ... | University success rating |
| --- | --- | --- | --- | --- |
| 20.000€ | Neighborhood 1 | 27 | | 5% |
| 25.000€ | Neighborhood 2 | 17 | | 17% |
| ... | | | | |
| 30.000€ | Neighborhood 3 | 11 | | 25% |
2. The app informs the user that it will try to identify interesting correlations between any of the selected columns.
3. The user runs the analysis. This may take up to 10 minutes.
4. One of the following happens when the analysis completes:
The application displays the message "We cannot find interesting correlations with this dataset."
The application displays the message "We have identified 2 interesting correlations". If the user opens the project, they can view the data and the correlation results:
It seems like "Family income per year" is correlated with "University success rating".
The more "Family income per year" increases, the more "University success rating" increases.
It seems like "Average number of children per family" is correlated with "University success rating".
The more "Average number of children per family" increases, the more "University success rating" decreases.
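The kind of result shown above can be reproduced with a Pearson correlation coefficient. This is a minimal sketch under stated assumptions: the page does not say which correlation measure A + Ω uses, the 0.7 threshold for "interesting" is hypothetical, and the three rows come from the illustrative table, which is far too little data for a real analysis.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long number lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column cannot be correlated with anything
    return cov / math.sqrt(var_x * var_y)


def describe(name_a, xs, name_b, ys, threshold=0.7):
    """Return a sentence for a strong correlation, or None if it is weak."""
    r = pearson(xs, ys)
    if abs(r) < threshold:  # threshold is an assumed cut-off
        return None
    direction = "increases" if r > 0 else "decreases"
    return f'The more "{name_a}" increases, the more "{name_b}" {direction}.'


# Values from the example table (illustrative numbers only).
income = [20000, 25000, 30000]
children = [27, 17, 11]
success = [5, 17, 25]

print(describe("Family income per year", income, "University success rating", success))
print(describe("Average number of children per family", children,
               "University success rating", success))
```

A coefficient near +1 corresponds to a direct correlation, near -1 to an inverse one, and near 0 to a neutral one.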
How it works: Outlier Detection
1. In order to investigate a possible story, upload your data to see whether it contains outliers.
Example:
A journalist has collected some information regarding the perception of government
corruption across various countries. They would like to understand whether this data
contains extreme observations that would be interesting to focus on.
| Country | Percentage of population that has a BSc degree | GDP | Percentage of people who believe that the government is corrupted |
| --- | --- | --- | --- |
| Country 1 | 20% | 28.000€ | 50% |
| Country 2 | 20% | 32.000€ | 20% |
| Country 3 | 22% | 29.000€ | 25% |
| ... | | | |
| Country 200 | 21% | 30.000€ | 22% |
2. The app informs the user that it will try to identify extreme observations in any of the selected columns.
3. The user runs the analysis. This may take up to 10 minutes.
4. One of the following happens when the analysis completes:
The application displays the message "We cannot find extreme observations with this dataset."
The application displays the message "We have identified 1 outlier": Country 1 seems to be an outlier (based on the "percentage of people who believe that the government is corrupted").
The application will highlight the outliers for the journalist:
| Country | Percentage of population that has a BSc degree | GDP | Percentage of people who believe that the government is corrupted |
| --- | --- | --- | --- |
| Country 1 | 20% | 28.000€ | 50% |
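One way to flag an extreme observation such as Country 1 is a modified z-score based on the median absolute deviation (MAD). The page does not state which outlier method A + Ω applies, so this is a sketch of a common technique rather than the platform's algorithm. The 3.5 cut-off is a conventional choice, and only the four countries shown in the table are used.

```python
import statistics


def mad_outliers(labels, values, cutoff=3.5):
    """Return the labels whose modified z-score exceeds `cutoff`."""
    median = statistics.median(values)
    mad = statistics.median([abs(v - median) for v in values])
    if mad == 0:
        return []  # no spread around the median, so nothing can be flagged
    flagged = []
    for label, value in zip(labels, values):
        # 0.6745 rescales the MAD so the score is comparable to a z-score.
        score = 0.6745 * abs(value - median) / mad
        if score > cutoff:
            flagged.append(label)
    return flagged


# Rows visible in the example table (illustrative numbers only).
countries = ["Country 1", "Country 2", "Country 3", "Country 200"]
corrupted_pct = [50, 20, 25, 22]
gdp = [28000, 32000, 29000, 30000]

print(mad_outliers(countries, corrupted_pct))
print(mad_outliers(countries, gdp))
```

A median-based score is used here because a single extreme value barely shifts the median, whereas it can noticeably inflate a mean and standard deviation.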
How it works: Fraud Detection for Political Science
1. The algorithm takes the leading digit of each input value and checks whether these digits follow a
specific distribution known as Benford's law, also called the ‘distribution of distributions’.
The idea behind this statistical theory is that the frequency of each leading digit in a
numerical distribution is predictable and nonuniform, closer to a power-law distribution.
In simple words, a given number is more than six times as likely to start with 1
(about 30.1% of cases) as with 9 (about 4.6%)!
The dataset should contain at least 100 numeric values spread over different ranges.
More specifically, it should include values from 1 to 9, 10 to 99, 100 to 999, and
1000 to 9999. The values do not need to be evenly distributed across these
ranges, but each range should occur at least once.
Candidate A
7676
1262
2068
8986
476
6029
739
2447
1621
50137
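The data requirements described above can be checked before running the recipe. The sketch below is an assumption about how such a check might look, with a hypothetical `check_requirements` helper. It is applied to the ten Candidate A values shown, which are only an excerpt and so would not pass on their own.

```python
# The Candidate A values shown in the example.
CANDIDATE_A = [7676, 1262, 2068, 8986, 476, 6029, 739, 2447, 1621, 50137]

# The ranges named in the requirements.
RANGES = [(1, 9), (10, 99), (100, 999), (1000, 9999)]


def check_requirements(values, minimum_count=100):
    """Return a list of unmet requirements (empty if the data qualifies)."""
    problems = []
    if len(values) < minimum_count:
        problems.append(f"only {len(values)} values, need at least {minimum_count}")
    for low, high in RANGES:
        # Each range must occur at least once; proportions do not matter.
        if not any(low <= v <= high for v in values):
            problems.append(f"no value between {low} and {high}")
    return problems


for problem in check_requirements(CANDIDATE_A):
    print(problem)
```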
To test the recipe, try downloading the following test datasets:
2. The app informs the user that it will try to identify fraud in the selected column.
3. The user runs the analysis. This may take up to 10 minutes.
4. One of the following happens when the analysis completes:
The application displays the message "We cannot detect fraud with this dataset."
The application displays the message "We have identified fraud in the dataset".
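The leading-digit check can be sketched as follows. Benford's law gives the expected share of leading digit d as log10(1 + 1/d). How A + Ω measures the deviation is not stated, so the chi-square statistic and the 5% significance level used here are assumptions, and the two test series are hypothetical. A large deviation is a signal worth investigating, not proof of fraud.

```python
import math
from collections import Counter

# Chi-square critical value for 8 degrees of freedom at the 5% level.
CRITICAL_95 = 15.507


def benford_expected(digit):
    """Expected share of numbers whose leading digit is `digit` (1 to 9)."""
    return math.log10(1 + 1 / digit)


def leading_digit(value):
    """First digit of the integer part of a value (must be at least 1)."""
    return int(str(abs(int(value)))[0])


def chi_square_vs_benford(values):
    """Chi-square distance between observed leading digits and Benford's law."""
    digits = [leading_digit(v) for v in values if abs(int(v)) >= 1]
    counts = Counter(digits)
    total = len(digits)
    statistic = 0.0
    for d in range(1, 10):
        expected = total * benford_expected(d)
        statistic += (counts[d] - expected) ** 2 / expected
    return statistic


# Hypothetical series: powers of two track Benford's law closely,
# while the integers 100..999 have uniformly spread leading digits.
benford_like = [2 ** k for k in range(100)]
uniform_like = list(range(100, 1000))

print(chi_square_vs_benford(benford_like) > CRITICAL_95)
print(chi_square_vs_benford(uniform_like) > CRITICAL_95)
```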
Request Recipe Usage
How does it work?
Create a project, let the system do its magic and tell
the story. As simple as one-two-three!
UPLOAD YOUR DATA
Upload your spreadsheet and navigate through a broad range of
pre-configured, ready to use recipes.
EMPLOY A.I.
Find correlations and outliers and build prediction models without any
knowledge of statistics, machine learning or programming.
TELL THE STORY
Structure and observe data correlations, taking heed of common pitfalls. Identify
what to research.
Investigate and narrate a causal model of the world.