Welcome to CrossPredict’s documentation!
The library makes cross-validation and report generation easy:
Easy to extend to other models
Supports LightGBM and XGBoost
Supports different cross-validation strategies:
    Cross-validation by users (RepeatedKFold)
    Stratified cross-validation by target column (RepeatedStratifiedKFold)
    Simple cross-validation (RepeatedKFold)
Easy use of target encoding with double cross-validation
Supports the category_encoders target-encoding library
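The by-user strategy listed above can be illustrated with plain scikit-learn. The following is a minimal sketch, not CrossPredict's own implementation: it assumes that "by users" means all rows of a given user fall on the same side of each split, and the helper name `split_by_user` and the toy data are hypothetical.

```python
import numpy as np
from sklearn.model_selection import RepeatedKFold


def split_by_user(user_ids, n_splits=5, n_repeats=2, random_state=0):
    """Yield (train_idx, valid_idx) row positions with no user on both sides."""
    user_ids = np.asarray(user_ids)
    unique_users = np.unique(user_ids)
    rkf = RepeatedKFold(n_splits=n_splits, n_repeats=n_repeats,
                        random_state=random_state)
    # Split the unique users, then map each fold of users back to row positions.
    for _, valid_users in rkf.split(unique_users):
        valid_mask = np.isin(user_ids, unique_users[valid_users])
        yield np.flatnonzero(~valid_mask), np.flatnonzero(valid_mask)


# Hypothetical toy data: 6 users with a varying number of rows each.
user_ids = np.array([1, 1, 2, 3, 3, 3, 4, 5, 5, 6])
folds = list(split_by_user(user_ids, n_splits=3, n_repeats=2, random_state=0))
for train_idx, valid_idx in folds:
    print(sorted(set(user_ids[valid_idx])))
```

Splitting the unique user ids rather than the rows keeps correlated rows of one user from leaking between training and validation, which is the usual motivation for this kind of grouping.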
Installation
python -m pip install crosspredict
Reports Preview
# Create the report object
a = ReportBinary()
a.plot_report(
    df,
    report_shape=(5, 2),
    report={'Roc-Auc': {'loc': (0, 0)},
            'Precision-Recall': [{'loc': (0, 1)}],
            'MeanTarget-by-Probability': [{'loc': (1, 0)}, {'loc': (1, 1)}],
            'Gini-by-Generations': {'loc': (2, 0), 'colspan': 2},
            'MeanTarget-by-Generations': {'loc': (3, 0), 'colspan': 2},
            'Probability-Distribution': [{'loc': (4, 0)}, {'loc': (4, 1)}]},
    cols_score=['result_egr_to_one', 'probability'],
    cols_target=['target', 'target2'],
    col_generation_deals='first_dt_no_comm_mon'
)
a.fig.savefig('report1.png')
a.plot_report(
    report_shape=(4, 2),
    report={'Roc-Auc': {'loc': (0, 0)},
            'Precision-Recall': [{'loc': (0, 1)}],
            'MeanTarget-by-Probability': [{'loc': (1, 0)}],
            'Gini-by-Generations': {'loc': (2, 0), 'colspan': 2},
            'MeanTarget-by-Generations': {'loc': (3, 0), 'colspan': 2},
            'Probability-Distribution': [{'loc': (1, 1)}]},
    cols_score=['probability']
)
a.fig.savefig('report2.png')
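For readers unfamiliar with the 'Gini-by-Generations' panel, the quantity it names can be sketched as follows. This assumes the common definition Gini = 2 * ROC AUC - 1, computed separately for each generation value; how ReportBinary computes it internally is not stated here, and the function name `gini_by_generation` and the toy data are hypothetical.

```python
import numpy as np
from sklearn.metrics import roc_auc_score


def gini_by_generation(y_true, y_score, generation):
    """Return {generation: Gini}, with Gini = 2 * ROC AUC - 1 per generation."""
    y_true, y_score, generation = map(np.asarray, (y_true, y_score, generation))
    result = {}
    for gen in np.unique(generation):
        mask = generation == gen
        # ROC AUC is undefined when a group contains a single class; skip it.
        if np.unique(y_true[mask]).size < 2:
            continue
        result[gen] = 2 * roc_auc_score(y_true[mask], y_score[mask]) - 1
    return result


# Hypothetical toy data: two monthly generations of four deals each.
y_true = [0, 1, 0, 1, 0, 1, 1, 0]
y_score = [0.1, 0.9, 0.2, 0.8, 0.6, 0.4, 0.7, 0.3]
generation = ["2020-01"] * 4 + ["2020-02"] * 4
ginis = gini_by_generation(y_true, y_score, generation)
print(ginis)
```

Tracking the metric per generation makes it easier to spot a score whose ranking quality degrades over time.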
Target Encoding with DoubleCrossValidation
# WOEEncoder comes from the category_encoders library;
# Iterator, CrossTargetEncoder and CrossLightgbmModel are CrossPredict classes
from category_encoders import WOEEncoder

# Create the cross-validation iterator
iter_df = Iterator(n_repeats=3,
                   n_splits=10,
                   random_state=0,
                   col_client='userid',
                   cv_byclient=True)
# Fit the target encoder (creates mappings for each fold)
cross_encoder = CrossTargetEncoder(iterator=iter_df,
                                   encoder_class=WOEEncoder,
                                   n_splits=5,
                                   n_repeats=3,
                                   random_state=0,
                                   col_client='userid',
                                   cv_byclient=True,
                                   col_encoded='goal1',
                                   cols=['field3', 'field2', 'field11', 'field23', 'field18', 'field20']
                                   )
cross_encoder.fit(train)
# Train the cross-validation models
model_class = CrossLightgbmModel(iterator=iter_df,
                                 feature_name=feature_name,
                                 params=params,
                                 cols_cat=['field3', 'field2', 'field11', 'field23', 'field18', 'field20'],
                                 num_boost_round=9999,
                                 early_stopping_rounds=50,
                                 valid=True,
                                 random_state=0,
                                 col_target='goal1',
                                 cross_target_encoder=cross_encoder)
result = model_class.fit(train)
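The idea behind target encoding with double cross-validation can be sketched independently of CrossPredict. The example below is an illustration under stated assumptions, not the library's implementation: it uses plain mean-target encoding in place of WOEEncoder, simple KFold instead of by-user splits, and the helper names `mean_encode` and `double_cv_encode` and the toy data are hypothetical. Within each outer fold, training rows are encoded by mappings fitted on inner folds that exclude them, and validation rows by a mapping fitted on the whole outer-training part.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import KFold


def mean_encode(fit_df, apply_df, col, target, prior):
    """Map categories in apply_df[col] to their mean target in fit_df."""
    mapping = fit_df.groupby(col)[target].mean()
    # Categories unseen in fit_df fall back to the prior (overall mean).
    return apply_df[col].map(mapping).fillna(prior).to_numpy()


def double_cv_encode(df, col, target, outer_splits=3, inner_splits=2, seed=0):
    """Yield (train_idx, valid_idx, train_encoded, valid_encoded) per outer fold."""
    outer = KFold(n_splits=outer_splits, shuffle=True, random_state=seed)
    for train_idx, valid_idx in outer.split(df):
        train, valid = df.iloc[train_idx], df.iloc[valid_idx]
        prior = train[target].mean()
        # Inner loop: each training row is encoded by a mapping fitted without it.
        train_enc = np.empty(len(train))
        inner = KFold(n_splits=inner_splits, shuffle=True, random_state=seed)
        for fit_pos, enc_pos in inner.split(train):
            train_enc[enc_pos] = mean_encode(
                train.iloc[fit_pos], train.iloc[enc_pos], col, target, prior)
        # Validation rows use a mapping fitted on the whole outer-training part.
        valid_enc = mean_encode(train, valid, col, target, prior)
        yield train_idx, valid_idx, train_enc, valid_enc


# Hypothetical toy data reusing column names from the example above.
df = pd.DataFrame({"field3": list("aabbccaabbcc"),
                   "goal1": [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1]})
folds = list(double_cv_encode(df, "field3", "goal1"))
```

Because no row's own target contributes to its encoded value, the model trained on the encoded feature sees training and validation encodings produced under comparable conditions, which limits target leakage.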
How to use
Plot_Reports_for_Binary_Classification_problem_example.ipynb
Simple_example_in_one_Notebook.ipynb
Iterator_class.ipynb
CrossModelFabric_class.ipynb
CrossTargetEncoder_class.ipynb