Quick start

Adaptive Real Time Machine Learning (artml) is a Python library for building real-time learning models. In many real-world scenarios, past data does not accurately represent the data seen at prediction time, so models need to be updated continuously. artml helps you build such real-time learning models.

A quick introduction

  from artml import bet    # bet builds the basic element table (BET)
  BET = bet.create_bet(dataframe)

  from artml.explore import stats
  stats.univariate(BET)    # summary statistics computed from the BET

  from artml.models import lda
  model = lda.LinearDiscriminantAnalysis()    # named model so the lda module is not shadowed
  model.fit(Traindata, 'target')
  y_pred = model.predict(testdata)

  from artml.metrics import scores
  scores.accuracy(y_test, y_pred)

For further information and use cases, visit the artml blog.

Basic Data processing

Before using artml, preprocess the data so that the basic element table can be built and updated. Basic preprocessing steps include:

  • Imputing missing data
  • One Hot encoding of categorical values

Write these steps as functions so that streaming data can be preprocessed online. artml does not include functions for either step, so custom functions should be created to suit the project.
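
As an illustration of what such custom functions could look like, here is a minimal sketch that handles one streaming record at a time. The class name OnlinePreprocessor, the running-mean imputation strategy, and the fixed list of category levels are all assumptions made for this example; none of them come from artml.

```python
# Hypothetical helper, not part of artml: preprocesses one streaming record at a time.
class OnlinePreprocessor:
    def __init__(self, numeric_columns, categories):
        # categories maps a categorical column to its known levels (assumed fixed up front).
        self.numeric_columns = list(numeric_columns)
        self.categories = {col: list(levels) for col, levels in categories.items()}
        self._sums = {col: 0.0 for col in self.numeric_columns}
        self._counts = {col: 0 for col in self.numeric_columns}

    def transform(self, record):
        row = {}
        for col in self.numeric_columns:
            value = record.get(col)
            if value is None:
                # Impute with the running mean of the values seen so far (0.0 before any value).
                count = self._counts[col]
                value = self._sums[col] / count if count else 0.0
            else:
                self._sums[col] += value
                self._counts[col] += 1
            row[col] = float(value)
        for col, levels in self.categories.items():
            for level in levels:
                # One-hot encode; missing or unknown levels produce all zeros.
                row[f"{col}_{level}"] = 1.0 if record.get(col) == level else 0.0
        return row


prep = OnlinePreprocessor(numeric_columns=["age"], categories={"color": ["red", "blue"]})
first = prep.transform({"age": 30, "color": "red"})
second = prep.transform({"age": 50, "color": "blue"})
third = prep.transform({"age": None, "color": "green"})  # missing age, unknown color
```

Because the encoder always emits the same columns in the same order, every processed record has the same shape, which is what a table that is updated incrementally would need.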

Installation: clone artml using git

  git clone https://github.com/AdaptiveMachineLearning/artml.git

Building Basic Element Table (BET)

The basic element table (BET) is the foundation for all subsequent steps in artml. Once generated, the BET can be used for both data exploration and modeling. For real-time use, the BET is updated with new data using the real-time update equation for the BET, and because the BET then reflects the new data, the model can be updated in real time as well.

create_bet

artml provides functions for building and updating the BET; use them on streaming data to keep the BET current. Use the create_bet function to build the BET. Its only input is the data, passed as a dataframe, and its output is a table in pandas dataframe format.

  #creating the basic elements table for the dataset.
  from artml import bet
  BET =  bet.create_bet(DataToCreateBet)

The BET needs to be created only once, at the beginning. To make any later changes, use the learn or forget functions.
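
To give a feel for what a table of summary elements might contain, the sketch below stores a count, two sums, and a cross-product sum for every ordered pair of columns. This layout and the function name build_summary_table are assumptions chosen for illustration; the actual contents of the table returned by create_bet may differ.

```python
from itertools import product


def build_summary_table(rows, columns):
    """Return {(a, b): {"n", "sum_a", "sum_b", "sum_ab"}} for every ordered column pair.

    Hypothetical layout for illustration only, not artml's internal format.
    """
    table = {}
    for a, b in product(columns, repeat=2):
        table[(a, b)] = {
            "n": len(rows),
            "sum_a": sum(r[a] for r in rows),
            "sum_b": sum(r[b] for r in rows),
            "sum_ab": sum(r[a] * r[b] for r in rows),
        }
    return table


rows = [{"x": 1.0, "y": 2.0}, {"x": 2.0, "y": 4.0}, {"x": 3.0, "y": 6.0}]
table = build_summary_table(rows, ["x", "y"])
```

The diagonal cells such as ("x", "x") hold the sum of squares of a single column, so one table of this kind is enough to derive means, variances, and covariances without revisiting the raw rows.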

learn

After building the BET, use the learn function to update it with new data. The arguments for this function are the BET (the variable holding the created BET) and the new data as a dataframe. It returns the updated BET as a dataframe.

  BET_New = bet.learn(BET, NewData)

forget

Just as the learn function updates the BET with new data, the forget function removes the effect of some data from the BET. The arguments for this function are the BET (the variable holding the created BET) and the data to be removed, passed as a dataframe. It returns the new BET as a dataframe.

  BET_updated = bet.forget(BET, Data)
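
One way to see why learning and forgetting can both be cheap is to think of the table as additive summaries: adding data adds to the sums, and removing data subtracts from them. The sketch below shows this for a single column. The functions summarize, add_rows, and remove_rows are hypothetical stand-ins, not the implementation of bet.learn or bet.forget.

```python
def summarize(values):
    # Additive summary of one column: count, sum, and sum of squares.
    return {"n": len(values), "total": sum(values), "total_sq": sum(v * v for v in values)}


def add_rows(summary, values):
    # "Learning": add the summary of the new values to the existing one.
    new = summarize(values)
    return {key: summary[key] + new[key] for key in summary}


def remove_rows(summary, values):
    # "Forgetting": subtract the summary of the values being removed.
    old = summarize(values)
    return {key: summary[key] - old[key] for key in summary}


base = summarize([1.0, 2.0, 3.0])
grown = add_rows(base, [4.0, 5.0])
restored = remove_rows(grown, [4.0, 5.0])
```

In this sketch, grown matches a summary built from all five values at once, and restored matches base again, so neither update needs access to the data that was summarized earlier.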

Real Time Data exploration

After generating the BET, we can use it for data exploration to understand the summarized data. Real-time data exploration falls into two types:

  • Real time Univariate Data Exploration
  • Real time Bivariate Data Exploration

Univariate exploration

Use the univariate function to get summarized univariate statistics for the data. Its only argument is the BET, from which the summarized results are generated. It returns a dataframe with all the univariate statistics.

  #Importing artml explore module for calculating univariate
  from artml.explore import stats
  stats.univariate(BET)
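
Univariate statistics of this kind can be derived from a count, a sum, and a sum of squares alone. The following sketch uses the standard sample-variance identity; the function name univariate_from_sums and the choice of the n - 1 denominator are assumptions for illustration and are not taken from artml.

```python
import math


def univariate_from_sums(n, total, total_sq):
    # Mean and sample variance recovered from additive summaries only.
    mean = total / n
    variance = (total_sq - n * mean * mean) / (n - 1)
    return {"count": n, "mean": mean, "variance": variance, "std": math.sqrt(variance)}


# Summaries of the values 1, 2, 3, 4, 5: n = 5, sum = 15, sum of squares = 55.
summary = univariate_from_sums(n=5, total=15.0, total_sq=55.0)
```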

Bivariate exploration

To get bivariate statistics for the data, use the covariance or correlation functions. To test whether the difference between the averages of groups is statistically significant, use the Ztest or Ttest functions.

  #Importing artml explore module for calculating bivariate statistics
  from artml.explore import stats
  stats.covariance(BET)
  stats.correlation(BET)
  stats.Ztest(BET, 'feature1name','feature2name')
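
The bivariate quantities can likewise be computed from sums and cross-product sums. The sketch below uses the textbook formulas for sample covariance, Pearson correlation, and a two-sample z statistic. The function names are hypothetical, and whether artml's Ztest uses exactly this form is not confirmed here.

```python
import math


def covariance_from_sums(n, sum_x, sum_y, sum_xy):
    # Sample covariance from additive summaries.
    return (sum_xy - sum_x * sum_y / n) / (n - 1)


def correlation_from_sums(n, sum_x, sum_y, sum_xy, sum_x2, sum_y2):
    # Pearson correlation: covariance divided by the product of standard deviations.
    cov = covariance_from_sums(n, sum_x, sum_y, sum_xy)
    var_x = covariance_from_sums(n, sum_x, sum_x, sum_x2)
    var_y = covariance_from_sums(n, sum_y, sum_y, sum_y2)
    return cov / math.sqrt(var_x * var_y)


def z_statistic(mean_a, var_a, n_a, mean_b, var_b, n_b):
    # Standard two-sample z statistic for a difference in means.
    return (mean_a - mean_b) / math.sqrt(var_a / n_a + var_b / n_b)


# Summaries of x = [1, 2, 3] and y = [2, 4, 6].
cov = covariance_from_sums(n=3, sum_x=6.0, sum_y=12.0, sum_xy=28.0)
corr = correlation_from_sums(n=3, sum_x=6.0, sum_y=12.0, sum_xy=28.0, sum_x2=14.0, sum_y2=56.0)
z = z_statistic(mean_a=5.0, var_a=4.0, n_a=8, mean_b=3.0, var_b=4.0, n_b=8)
```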

Real Time Learning models

The ART-ML technique can be applied to linear regression and classification algorithms such as MLR, Naïve Bayes, LDA, and PCA.

  from artml.models import naive_bayes
  gnb = naive_bayes.GaussianNB()    # reference the class through the imported module
  gnb.fit(BET, 'Target1name','Target2name')
  y_pred = gnb.predict(TestingData)

artml is designed so that its syntax resembles that of sklearn. For detailed information about the different models, refer to the Real Time Learning models section.
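
To illustrate why models like these can be refreshed from summary statistics alone, here is a minimal single-feature Gaussian Naive Bayes sketch that is fitted from per-class counts and sums instead of raw rows. The function names, the dictionary layout, and the use of the population variance are assumptions for this example; this is not artml's GaussianNB.

```python
import math


def fit_gaussian_nb(class_stats):
    """class_stats maps label -> {"n", "total", "total_sq"} for a single feature."""
    total_n = sum(s["n"] for s in class_stats.values())
    model = {}
    for label, s in class_stats.items():
        mean = s["total"] / s["n"]
        variance = s["total_sq"] / s["n"] - mean * mean  # population variance
        model[label] = {"prior": s["n"] / total_n, "mean": mean, "variance": variance}
    return model


def predict(model, x):
    def log_posterior(p):
        # Log prior plus Gaussian log-likelihood of x under the class parameters.
        return (math.log(p["prior"])
                - 0.5 * math.log(2 * math.pi * p["variance"])
                - (x - p["mean"]) ** 2 / (2 * p["variance"]))
    return max(model, key=lambda label: log_posterior(model[label]))


class_stats = {
    "low": {"n": 3, "total": 6.0, "total_sq": 14.0},     # values 1, 2, 3
    "high": {"n": 3, "total": 27.0, "total_sq": 245.0},  # values 8, 9, 10
}
model = fit_gaussian_nb(class_stats)
label_near_low = predict(model, 2.5)
label_near_high = predict(model, 8.0)
```

Since the fit only reads counts and sums, updating those summaries with new data and refitting gives an updated model without storing the historical rows.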

Real Time Feature selection

After building the real-time learning models, artml also lets you reconfigure them in real time by selecting the best features with feature_selection. For artml feature selection, use any of the functions below.

  #Importing feature_selection algorithm from artml library
  from artml.feature_selection import mahalanobis_features
  best_features = mahalanobis_features.mahalanobis_selection()
  features = best_features.forward_selection(BET,'Targetname', alpha=1.1)

The arguments for this function are the BET, the target name, and an alpha value. For detailed information, see the later sections.
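
The general idea of forward selection with a Mahalanobis criterion can be sketched as a greedy loop that keeps adding the feature that most increases the distance between two class means. The sketch below assumes a diagonal covariance for simplicity and uses a hypothetical min_gain stopping threshold. It does not reproduce artml's mahalanobis_selection, and it makes no claim about what artml's alpha parameter means.

```python
def mahalanobis_sq(mean_a, mean_b, variances, features):
    # Squared Mahalanobis distance between class means, assuming a diagonal covariance.
    return sum((mean_a[f] - mean_b[f]) ** 2 / variances[f] for f in features)


def forward_selection(mean_a, mean_b, variances, min_gain=0.5):
    # min_gain is a hypothetical stopping threshold chosen for this sketch.
    selected, remaining, best = [], set(variances), 0.0
    while remaining:
        # Score each candidate by the distance reached if it were added next.
        scores = {f: mahalanobis_sq(mean_a, mean_b, variances, selected + [f])
                  for f in remaining}
        candidate = max(sorted(scores), key=scores.get)
        if scores[candidate] - best < min_gain:
            break  # the best remaining feature adds too little separation
        selected.append(candidate)
        remaining.remove(candidate)
        best = scores[candidate]
    return selected


mean_a = {"x": 0.0, "y": 0.0, "z": 0.0}
mean_b = {"x": 3.0, "y": 1.0, "z": 0.1}
variances = {"x": 1.0, "y": 1.0, "z": 1.0}
selected = forward_selection(mean_a, mean_b, variances)
```

Here the feature z barely separates the two classes, so the loop stops after selecting x and y.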

Model Evaluation

For general model evaluation, such as accuracy and other scores, we can use either sklearn or the built-in functions in artml; both give similar results. Use the scores module from artml as shown below.

  #Import scores from artml for finding the accuracy of the model
  from artml.metrics import scores
  scores.accuracy(y_test, y_pred)
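
For reference, accuracy is the fraction of predictions that match the true labels. The sketch below is a plain-Python version of that definition; the function is a hypothetical stand-in and is not the implementation behind scores.accuracy.

```python
def accuracy(y_true, y_pred):
    # Fraction of positions where the prediction equals the true label.
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one label is required")
    correct = sum(1 for truth, guess in zip(y_true, y_pred) if truth == guess)
    return correct / len(y_true)


result = accuracy([1, 0, 1, 1], [1, 0, 0, 1])  # 3 of 4 predictions match
```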

artml also includes extra features for real-time model building and data exploration, listed below. Some of them already exist in the current version, and the remaining ones will be added later.

  • Pipeline method from real-time exploration to model building
  • Ttest and Ztest functions
  • Regularization for MLR (Ridge Regression)
  • Backward and forward feature selection techniques