Skip to content

Data Model

The SQLite database (dbn-poc-database.db) contains a single table, msmeloans, with ~11,000 rows and 40 columns.


Table: msmeloans

DDL

CREATE TABLE "msmeloans" (
  "SeqNo"                          REAL,
  "Gender"                         TEXT,
  "State"                          TEXT,
  "Sector"                         TEXT,
  "PFI ID"                         TEXT,
  "FullNames"                      TEXT,
  "BirthDate"                      TEXT,
  "AmountGranted"                  INTEGER,
  "EffectiveDate"                  TEXT,
  "Tenor"                          INTEGER,
  "TenorType"                      TEXT,
  "MaturityDate"                   TEXT,
  "TenorInDays"                    INTEGER,
  "Rate"                           REAL,
  "ScheduleType"                   TEXT,
  "InterestRepayStartDate"         TEXT,
  "PrincipalRepayStartDate"        TEXT,
  "InterestRepayFreq"              TEXT,
  "PrincipalRepayFreq"             TEXT,
  "NatureOfBusiness"               TEXT,
  "FirstTimeAccessToCredit"        TEXT,
  "StartUp"                        TEXT,
  "MSMEAnnualTurnover"             INTEGER,
  "NumberOfEmployees"              INTEGER,
  "Age"                            INTEGER,
  "age_group_id"                   INTEGER,
  "Status"                         INTEGER,
  "DateAdded"                      TEXT,
  "ES_Rating"                      TEXT,
  "green_energy/energy_efficiency" TEXT,
  "FirstTimeAccessToCredit_norm"   INTEGER,
  "StartUp_norm"                   INTEGER,
  "age_group"                      TEXT,
  "loan_to_turnover"               REAL,
  "employees_bucket"               TEXT,
  "synthetic_default"              INTEGER,
  "risk_score"                     REAL,
  "risk_category"                  TEXT,
  "pred_default_prob"              REAL,
  "predicted_default"              INTEGER
);

Column Reference

Borrower Demographics

Column Type Description
SeqNo REAL Sequential record number
Gender TEXT M or F
State TEXT State ID of the borrower
FullNames TEXT Full name of the borrower
BirthDate TEXT Date of birth (YYYY-MM-DD)
Age INTEGER Age in years
age_group TEXT Age bracket (e.g., 36-45)
age_group_id INTEGER Numeric ID for age bracket

Loan Details

Column Type Description
PFI ID TEXT Participating Financial Institution identifier
AmountGranted INTEGER Loan amount in local currency
EffectiveDate TEXT Date the loan was disbursed
MaturityDate TEXT Date the loan matures
Tenor INTEGER Loan duration
TenorType TEXT Unit of tenor (e.g., Months)
TenorInDays INTEGER Tenor expressed in days
Rate REAL Interest rate (%)
ScheduleType TEXT Repayment schedule type
InterestRepayStartDate TEXT First interest repayment date
PrincipalRepayStartDate TEXT First principal repayment date
InterestRepayFreq TEXT Frequency of interest repayment
PrincipalRepayFreq TEXT Frequency of principal repayment
Status INTEGER Loan status code
DateAdded TEXT Record creation date

Business Profile

Column Type Description
Sector TEXT Business sector (e.g., Trade and Commerce)
NatureOfBusiness TEXT Nature of business description
MSMEAnnualTurnover INTEGER Annual business turnover
NumberOfEmployees INTEGER Number of employees
employees_bucket TEXT Categorised employee band (e.g., Solo, Micro)
loan_to_turnover REAL Ratio of loan to annual turnover
FirstTimeAccessToCredit TEXT Whether this is the first credit access
FirstTimeAccessToCredit_norm INTEGER Normalised: 1 = first time, 0 = returning
StartUp TEXT Whether the business is a startup
StartUp_norm INTEGER Normalised: 1 = startup, 0 = established
ES_Rating TEXT Environmental & Social rating
green_energy/energy_efficiency TEXT Green energy flag

Risk & Prediction

Column Type Description
synthetic_default INTEGER Simulated default flag (0 or 1)
risk_score REAL Composite risk score (0.0–1.0)
risk_category TEXT Risk label (e.g., Low Risk, High Risk)
pred_default_prob REAL ML-predicted probability of default (0.0–1.0)
predicted_default INTEGER Binary prediction: 1 = will default

Sample Data

SELECT FullNames, Sector, AmountGranted, pred_default_prob, risk_category
FROM msmeloans
ORDER BY pred_default_prob DESC
LIMIT 5;
FullNames Sector AmountGranted pred_default_prob risk_category
James Okafor Agriculture 8,000,000 0.92 High Risk
Aisha Bello Manufacturing 5,500,000 0.87 High Risk
... ... ... ... ...