Fraudulent Transaction Prediction Model

Budget: $100

I am providing a labelled set of historical transaction records and I need a complete, end-to-end solution that pinpoints which future transactions are most likely to be fraudulent. Your work begins with thorough Exploratory Data Analysis and the key statistical checks that reveal distribution quirks, correlations, and outliers; these insights will guide thoughtful feature engineering so that raw fields become powerful fraud-signalling variables.

Once the data foundations are solid, build a supervised classification model. Feel free to experiment with gradient boosting, random forests, or any other algorithm that offers high recall without sacrificing precision. I am particularly interested in the classic fraud-focused metrics: confusion matrix results, precision-recall curves, ROC-AUC scores, and a clear discussion of false-positive cost. Explain each modelling choice, hyper-parameter decision, and validation step in plain language.

Deliverables
• A well-commented Jupyter notebook (Python, pandas, scikit-learn or similar) that runs end-to-end: loading data, EDA visuals, feature generation, model training, evaluation, and prediction output
• A concise written summary or slide deck highlighting key findings, feature importance, model performance, and recommended next steps
• A standalone CSV (or parquet) file containing the probability score for each transaction in the test set, ready for downstream use

Acceptance criteria: the notebook executes without errors on a fresh install, the model achieves materially better fraud-detection recall than a naïve benchmark, and all insights are reproducible from the code provided.

If you have questions about the dataset structure or want to propose alternative evaluation approaches, include them in your initial message so we can lock them in early.
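To illustrate the kind of evaluation I have in mind, here is a minimal sketch. Everything in it is a placeholder assumption rather than a requirement: the data is synthetic (generated with scikit-learn's `make_classification`, about 3% positives), the model is a plain random forest, the decision threshold of 0.5 is arbitrary, and the naïve benchmark is taken to be "always predict the majority class". The real dataset and your chosen model would replace all of these.

```python
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (
    average_precision_score,
    confusion_matrix,
    recall_score,
    roc_auc_score,
)
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the transaction data (assumption: ~3% fraud rate).
X, y = make_classification(
    n_samples=5000,
    n_features=10,
    n_informative=5,
    weights=[0.97],
    class_sep=1.5,
    random_state=0,
)

# Stratify so the rare fraud class appears in both train and test splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Naive benchmark (assumption): always predict the majority class.
baseline = DummyClassifier(strategy="most_frequent").fit(X_train, y_train)
baseline_recall = recall_score(y_test, baseline.predict(X_test))

# Placeholder model; class_weight="balanced" compensates for class imbalance.
model = RandomForestClassifier(
    n_estimators=200, class_weight="balanced", random_state=0
).fit(X_train, y_train)

# Probability of fraud per test transaction (what the scores file would hold).
scores = model.predict_proba(X_test)[:, 1]

# Arbitrary illustrative threshold; the real one should reflect
# the cost of false positives versus missed fraud.
threshold = 0.5
y_pred = (scores >= threshold).astype(int)

cm = confusion_matrix(y_test, y_pred)            # rows: true, cols: predicted
model_recall = recall_score(y_test, y_pred)
roc_auc = roc_auc_score(y_test, scores)
pr_auc = average_precision_score(y_test, scores)  # summarises the PR curve

print("Confusion matrix:\n", cm)
print(f"Recall: model={model_recall:.3f}, baseline={baseline_recall:.3f}")
print(f"ROC-AUC={roc_auc:.3f}, average precision={pr_auc:.3f}")
```

The comparison between `model_recall` and `baseline_recall` is one possible reading of the "materially better than a naïve benchmark" criterion; if you prefer a different benchmark or threshold policy, propose it in your initial message.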

Python
