Quick start
Adaptive Real Time Machine Learning (artml) is a Python library for building real time learning models. In many real-world scenarios, past data is not truly representative of the data seen at prediction time, so models need to be updated continuously. The artml library helps you build these real time learning models.
A quick introduction
# Build the basic element table (BET) from a dataframe
from artml import bet
BET = bet.create_bet(dataframe)
# Explore the summarized data
from artml.explore import stats
stats.univariate(BET)
# Fit a model and predict (a separate name avoids shadowing the lda module)
from artml.models import lda
lda_model = lda.LinearDiscriminantAnalysis()
lda_model.fit(Traindata, 'target')
y_pred = lda_model.predict(testdata)
# Evaluate the predictions
from artml.metrics import scores
scores.accuracy(y_test, y_pred)
For further information and use cases, visit the artml blog.
Basic Data processing
Before using artml, the data must be preprocessed so that the basic element table can be built and updated. Basic preprocessing steps include:
- Imputing missing data
- One Hot encoding of categorical values
These steps should be implemented as functions so that streaming data can be preprocessed online. artml does not include functions for any of them; custom functions should be created to suit the project.
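A minimal sketch of such a custom preprocessing function, using pandas. The column names, category levels, and fill values are hypothetical and would come from your own project; nothing here is part of artml.

```python
import pandas as pd

# Hypothetical schema for illustration: fixing the category levels up front
# means every streaming batch is encoded into the same columns.
CATEGORIES = {"color": ["red", "green", "blue"]}
# Hypothetical fill values, e.g. estimated once from historical data.
FILL_VALUES = {"height": 170.0, "color": "red"}


def preprocess(batch: pd.DataFrame) -> pd.DataFrame:
    """Impute missing values and one-hot encode one batch of streaming data."""
    out = batch.fillna(value=FILL_VALUES)
    for column, levels in CATEGORIES.items():
        # A fixed categorical dtype keeps the dummy columns stable even when
        # a batch does not contain every level.
        out[column] = pd.Categorical(out[column], categories=levels)
    return pd.get_dummies(out, columns=list(CATEGORIES), dtype=float)


batch = pd.DataFrame({"height": [180.0, None], "color": ["green", None]})
processed = preprocess(batch)
```

Fixing the levels in advance matters for streaming data: if the encoded columns changed from batch to batch, the batches could not be combined into one table.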
Installation: clone artml using git
git clone https://github.com/AdaptiveMachineLearning/artml.git
Building Basic Element Table (BET)
The basic element table (BET) is the foundation for all subsequent steps in artml. Once generated, it can be used for both data exploration and modeling. For real time use, the BET is updated with the real time equation for BET; as the BET is updated with new data, the model can be updated in real time as well.
create_bet
The artml library has functions for building and updating the BET; use them on streaming data to keep the BET current. Use the create_bet function to build the BET. Its only input is the data, as a dataframe, and its output is a table in pandas dataframe format.
# Create the basic element table for the dataset.
from artml import bet
BET = bet.create_bet(DataToCreateBet)
The BET needs to be created only once, at the beginning. To make any later changes, use the learn or forget functions.
learn
After building the BET, use the learn function to update it with new data. Its arguments are the BET (the variable holding the created BET) and the new data as a dataframe. It returns the updated BET as a dataframe.
BET_New = bet.learn(BET, NewData)
forget
Just as the learn function updates the BET with new data, the forget function removes the effect of some data from the BET. Its arguments are the BET (the variable holding the created BET) and the data to be removed, as a dataframe. It returns the new BET as a dataframe.
BET_updated = bet.forget(BET, Data)
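The idea behind learning and forgetting can be illustrated with additive summary statistics: if a table stores only counts, sums, and sums of products, new rows are added in and old rows are subtracted out. The sketch below is not artml's implementation, and the dictionary layout and helper names are hypothetical; it only assumes that the summary is made of such additive quantities.

```python
import numpy as np


def create_stats(data: np.ndarray) -> dict:
    """Summarize rows as a count, per-column sums, and a cross-product matrix."""
    return {"n": data.shape[0], "sums": data.sum(axis=0), "cross": data.T @ data}


def learn(stats: dict, new_data: np.ndarray) -> dict:
    """Add the contribution of new rows to the summary."""
    update = create_stats(new_data)
    return {key: stats[key] + update[key] for key in stats}


def forget(stats: dict, old_data: np.ndarray) -> dict:
    """Subtract the contribution of rows that should no longer count."""
    update = create_stats(old_data)
    return {key: stats[key] - update[key] for key in stats}


rng = np.random.default_rng(0)
first, second = rng.normal(size=(50, 3)), rng.normal(size=(20, 3))

stats = create_stats(first)
stats = learn(stats, second)   # equivalent to summarizing all 70 rows
stats = forget(stats, first)   # equivalent to summarizing `second` only
```

Because every stored quantity is a plain sum over rows, neither update needs access to the rows that were summarized earlier.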
Real Time Data exploration
After generating the BET, we can use it for data exploration to understand the summarized data. Real time data exploration falls into two types:
- Real time Univariate Data Exploration
- Real time Bivariate Data Exploration
Univariate exploration
Use the univariate function to get summarized univariate statistics for the data. Its only argument is the BET. It returns a dataframe with all the univariate stats.
# Import the artml explore module for calculating univariate statistics
from artml.explore import stats
stats.univariate(BET)
Bivariate exploration
To get bivariate stats for the data, use the covariance or correlation functions. To test whether the difference between group averages is statistically significant, use the Ztest or Ttest functions.
# Import the artml explore module for calculating bivariate statistics
from artml.explore import stats
stats.covariance(BET)
stats.correlation(BET)
stats.Ztest(BET, 'feature1name','feature2name')
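To show why summarized data is enough for bivariate statistics, the sketch below computes covariance and correlation from a row count, column sums, and a cross-product matrix, without revisiting the raw rows. The function names are hypothetical and this is not how artml's stats module is necessarily written; it only applies the standard formulas.

```python
import numpy as np


def covariance_from_sums(n: int, sums: np.ndarray, cross: np.ndarray) -> np.ndarray:
    """Sample covariance: (sum(x_i * x_j) - sum(x_i) * sum(x_j) / n) / (n - 1)."""
    return (cross - np.outer(sums, sums) / n) / (n - 1)


def correlation_from_sums(n: int, sums: np.ndarray, cross: np.ndarray) -> np.ndarray:
    """Pearson correlation: covariance scaled by both standard deviations."""
    cov = covariance_from_sums(n, sums, cross)
    std = np.sqrt(np.diag(cov))
    return cov / np.outer(std, std)


rng = np.random.default_rng(1)
data = rng.normal(size=(200, 3))
# The only quantities kept from the raw data:
n, sums, cross = data.shape[0], data.sum(axis=0), data.T @ data

cov = covariance_from_sums(n, sums, cross)
corr = correlation_from_sums(n, sums, cross)
```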
Real Time Learning models
The ART-ML technique can be applied to all the linear regression and classification algorithms, such as MLR, Naïve Bayes, LDA, and PCA.
from artml.models import naive_bayes
gnb = naive_bayes.GaussianNB()
gnb.fit(BET, 'Target1name','Target2name')
y_pred = gnb.predict(TestingData)
The artml library is built so that its syntax resembles that of sklearn. For detailed information about the different models, refer to the Real Time Learning models section.
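As an illustration of how a linear model can be updated without refitting on past data, the sketch below accumulates the normal-equation terms X^T X and X^T y batch by batch and solves for the coefficients on demand. The class is hypothetical and is not artml's MLR implementation; it assumes that X^T X is invertible.

```python
import numpy as np


class RunningLinearRegression:
    """Least squares fitted from accumulated X^T X and X^T y."""

    def __init__(self, n_features: int) -> None:
        size = n_features + 1  # one extra column for the intercept
        self.xtx = np.zeros((size, size))
        self.xty = np.zeros(size)

    def learn(self, X: np.ndarray, y: np.ndarray) -> None:
        """Add one batch of rows to the accumulated sums."""
        Xb = np.column_stack([np.ones(len(X)), X])
        self.xtx += Xb.T @ Xb
        self.xty += Xb.T @ y

    def coefficients(self) -> np.ndarray:
        """Solve the normal equations; returns [intercept, slopes...]."""
        return np.linalg.solve(self.xtx, self.xty)


rng = np.random.default_rng(2)
X = rng.normal(size=(300, 2))
y = 1.5 + X @ np.array([2.0, -3.0]) + rng.normal(scale=0.1, size=300)

model = RunningLinearRegression(n_features=2)
for start in range(0, 300, 100):  # feed the data in three batches
    model.learn(X[start:start + 100], y[start:start + 100])
coef = model.coefficients()
```

The coefficients obtained after the last batch match a single least-squares fit on all 300 rows, because both use the same sums.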
Real Time Feature selection
After building the real time learning models, artml also lets you configure models in real time by selecting the best features using feature_selection. For artml feature selection, use any of the functions below.
# Import the feature selection algorithm from the artml library
from artml.feature_selection import mahalanobis_features
best_features = mahalanobis_features.mahalanobis_selection()
features = best_features.forward_selection(BET,'Targetname', alpha=1.1)
The arguments for this function are the BET, the target name, and an alpha value. For detailed information, see the later sections.
Model Evaluation
For general model evaluation, such as accuracy and other scores, we can use either sklearn or the built-in functions in artml; both give similar results. Use the scores module from artml as shown below.
# Import scores from artml to compute the accuracy of the model
from artml.metrics import scores
scores.accuracy(y_test, y_pred)
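For reference, accuracy is the fraction of predictions that match the true labels. The plain Python sketch below shows that definition; it is not the code behind scores.accuracy, and the function is hypothetical.

```python
def accuracy(y_true, y_pred) -> float:
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    matches = sum(t == p for t, p in zip(y_true, y_pred))
    return matches / len(y_true)


# Three of the four predictions match the true labels.
result = accuracy([1, 0, 1, 1], [1, 0, 0, 1])
```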
artml also includes extra features for real time model building and data exploration (some already exist in the current version and the remaining ones will be added later):
- Pipeline method from real time exploration to model building
- Ttest and Ztest Functions
- Regularization for MLR (Ridge Regression)
- Backward and forward feature selection techniques