Part 1: Install MLflow on a Local Machine
Part 2: Train an Example Model and Store It in Local MLflow
Part 3: Expose an Example API on the Local Machine
Part 4: API Transform for the Model API
Part 5: Install MLflow on a GKE Cluster with Helm
Part 6: Store the Model in a Remote MLflow Cluster
Part 7: Serve the Model API on the Cluster
API Transform
In Part 3 we exposed an API server, but users have to call it like this:
curl -X POST -H "Content-Type:application/json" \
--data "{\"dataframe_split\": {\"data\":[[ \
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
0.0, 0.0, 1.0, 0.0]]}}" \
http://127.0.0.1:1234/invocations
That is not easy to use: the caller has no way of knowing what each 0.0 means or what order the values go in. It would be much better if the user could call it like this instead:
curl -X POST -H "Content-Type:application/json" \
--data '{"country": "Russian Federation","timestamp": "Feb 22, 2023 @ 16:59:59.942"}' \
http://127.0.0.1:6789/v2/gateway
Much easier, right?
Create Gateway Server API
- Create a new server.py in the same project as train.py and import the libraries below. (Ideally this file would live in a separate project, so it can be deployed as its own microservice on the cluster.)
from flask import Flask, request, jsonify
import requests
import json
from waitress import serve
import os
import pandas as pd
from sklearn.compose import make_column_transformer
from sklearn.preprocessing import OneHotEncoder
from sharelib import maskOfficeHour
- Add default environment variables that can be overridden when deploying to the cluster with a Deployment
app = Flask(__name__)
host = os.environ.get('host_ml', '127.0.0.1')
port = os.environ.get('port_ml', '1234')
gateway_port = os.environ.get('gateway_port_ml', '6789')
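For reference, these variables could be overridden from a Kubernetes Deployment roughly like this. The env names match the code above; the container name, image, and service name are placeholders, not from the original:

```yaml
# Fragment of a Deployment pod spec; only the env section matters here.
containers:
  - name: gateway
    image: my-registry/gateway:latest    # placeholder image
    env:
      - name: host_ml
        value: "mlflow-model-service"    # assumed in-cluster service name
      - name: port_ml
        value: "1234"
      - name: gateway_port_ml
        value: "6789"
```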
- Create a function that transforms the user's data
def createDataV2(request_country,request_timestamp):
print(request_country + " " + request_timestamp)
test_df = pd.DataFrame([[request_country,request_timestamp]],columns=['mt.ads_country_dst', '@timestamp'])
test_df = maskOfficeHour(test_df)
test_df = test_df.drop(['@timestamp'], axis=1)
X_new = X_transform.transform(test_df)
data = {
"data":
X_new.toarray().tolist()
}
return data
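`maskOfficeHour` is imported from the author's `sharelib` and is not shown in this part. A minimal sketch of what it could look like, assuming it parses the `@timestamp` strings (e.g. "Feb 22, 2023 @ 16:59:59.942") and flags weekday office hours; the exact weekday/hour rules here are assumptions:

```python
import pandas as pd

def maskOfficeHour(df):
    # Parse timestamps like "Feb 22, 2023 @ 16:59:59.942"
    ts = pd.to_datetime(df['@timestamp'], format='%b %d, %Y @ %H:%M:%S.%f')
    is_weekday = ts.dt.dayofweek < 5          # Mon-Fri
    in_hours = ts.dt.hour.between(9, 17)      # assumed 09:00-17:59 office hours
    df = df.copy()
    df['is_OfficeHour'] = (is_weekday & in_hours).astype(int)
    return df
```

Whatever the real rules are, the important part is that `server.py` and `train.py` use the same function, so the `is_OfficeHour` feature is computed identically at training and serving time.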
- Create an API route that converts the data and forwards it to the model's /invocations endpoint
@app.route('/v2/gateway', methods=['POST'])
def get_invocationsV2():
headers = {
"Content-Type": "application/json",
}
content = request.json
request_country = content['country']
request_timestamp = content['timestamp']
content_data = createDataV2(request_country,request_timestamp)
try:
resp = requests.post(
url="http://%s:%s/invocations" % (host, port),
data=json.dumps({"dataframe_split": content_data}),
headers=headers,
)
print(resp.status_code)
return resp.json()
    except Exception as e:
        # If requests.post raised, resp was never assigned, so return an
        # error payload instead of resp.json()
        errmsg = "Caught exception attempting to call model endpoint: %s" % e
        print(errmsg)
        return jsonify({"error": errmsg}), 500
- Create a main block that loads the dataset and fits the column_transformer
It is important to fit the column_transformer on the same dataset used to train the model, so the one-hot columns line up exactly between training and the conversion done by the API transform.
if __name__ == '__main__':
df = pd.read_csv("data/firewall-traffic.csv")
df_country = df["mt.ads_country_dst"]
df_OfficeHour = maskOfficeHour(df)
df_categories = pd.concat([df_country, df_OfficeHour['is_OfficeHour']], axis=1, sort=False,)
enc = OneHotEncoder(handle_unknown='ignore')
X_transform = make_column_transformer((enc,['mt.ads_country_dst']),(enc,['is_OfficeHour']))
X_transform.fit(df_categories)
print("Server Ready On Port " + gateway_port)
serve(app, host="0.0.0.0", port=gateway_port)
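An alternative to refitting the transformer from the CSV on every gateway start is to persist the fitted ColumnTransformer once (e.g. at the end of train.py) and load it in server.py. This is a sketch, not the author's method; the file name `x_transform.joblib` and the tiny in-line dataset are assumptions standing in for the real firewall-traffic data:

```python
import joblib
import pandas as pd
from sklearn.compose import make_column_transformer
from sklearn.preprocessing import OneHotEncoder

# Stand-in for the categories derived from data/firewall-traffic.csv
df_categories = pd.DataFrame({
    "mt.ads_country_dst": ["Thailand", "Russian Federation"],
    "is_OfficeHour": [1, 0],
})

enc = OneHotEncoder(handle_unknown="ignore")
X_transform = make_column_transformer((enc, ["mt.ads_country_dst"]),
                                      (enc, ["is_OfficeHour"]))
X_transform.fit(df_categories)

# In train.py: save the fitted transformer alongside the model
joblib.dump(X_transform, "x_transform.joblib")

# In server.py: load it instead of reading the CSV and refitting
X_transform = joblib.load("x_transform.joblib")
```

This also guarantees the gateway encodes columns exactly as the model saw them, since both sides share one fitted artifact instead of refitting from the raw file.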
Test API Transform
- Start server.py
python server.py
- Send a test call
curl -X POST -H "Content-Type:application/json" \
--data '{"country": "Russian Federation","timestamp": "Feb 22, 2023 @ 16:59:59.942"}' \
http://127.0.0.1:6789/v2/gateway | jq
- Let's try generating more traffic
while true;do curl -X POST -H "Content-Type:application/json" \
--data '{"country": "Russian Federation","timestamp": "Feb 22, 2023 @ 16:59:59.942"}' \
http://127.0.0.1:6789/v2/gateway;done
- Have fun!!!
Note: in my design, this API Transform has to serve near-real-time traffic from other systems, so it needs a caching layer.
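One lightweight way to sketch that cache, assuming repeated (country, timestamp) pairs are common: memoize the transform step with `functools.lru_cache`. The stand-in function below just echoes its inputs; in server.py you would wrap `createDataV2` instead, and the `maxsize` is an assumption:

```python
from functools import lru_cache

calls = {"count": 0}  # track how often the underlying transform actually runs

@lru_cache(maxsize=4096)
def cached_transform(country: str, timestamp: str):
    calls["count"] += 1
    # Stand-in for createDataV2(country, timestamp); args must be hashable,
    # which these strings are.
    return (country, timestamp)

cached_transform("Russian Federation", "Feb 22, 2023 @ 16:59:59.942")
cached_transform("Russian Federation", "Feb 22, 2023 @ 16:59:59.942")
# the second call is served from the cache, so the transform ran only once
```

For a real deployment across multiple gateway replicas, a shared cache (e.g. Redis) keyed the same way would be the natural next step, since `lru_cache` is per-process.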
— — — — — — — — — — — — — — — — — — — — — — — — — — — — —
Credit : TrueDigitalGroup
— — — — — — — — — — — — — — — — — — — — — — — — — — — — —