Table 3. Time measured for each of the two models, from the user's input until the model ran and output a response.

LLM	Device	Measured time
Inference Engine: Phi-3.5 (3.82 GB)	PC	Response to user’s question: 10,809 ms; Encoding user’s input: 10 ms; Decoding LLM’s output: 13 ms
Inference Engine: Phi-3.5 (3.82 GB)	Laptop	Response to user’s question: 13,358 ms; Encoding user’s input: 12 ms; Decoding LLM’s output: 13 ms
OpenAI API: GPT-4o	PC	Response to user’s question: 913 ms; “openai-processing-ms”: 354 ms
OpenAI API: GPT-4o	Laptop (LAN)	Response to user’s question: 990 ms; “openai-processing-ms”: 343 ms
OpenAI API: GPT-4o	Laptop (Wi-Fi)	Response to user’s question: 1,498 ms; “openai-processing-ms”: 380 ms
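
For the GPT-4o rows, subtracting the reported “openai-processing-ms” value from the total response time gives a rough indication of how much of each round trip lies outside the server-side processing reported by that header (for example, network transfer and client-side handling). The following is a minimal sketch of that arithmetic using only the values in Table 3; the variable and function names are illustrative assumptions, and what the remainder consists of is an interpretation rather than something measured here.

```python
# Values copied from the GPT-4o rows of Table 3, in milliseconds.
# The dictionary layout and names are illustrative, not from the original work.
measurements = {
    "PC": {"total_ms": 913, "openai_processing_ms": 354},
    "Laptop (LAN)": {"total_ms": 990, "openai_processing_ms": 343},
    "Laptop (Wi-Fi)": {"total_ms": 1498, "openai_processing_ms": 380},
}


def remaining_ms(row):
    """Time not accounted for by the openai-processing-ms value."""
    return row["total_ms"] - row["openai_processing_ms"]


# Remainder per device: total response time minus reported processing time.
overhead = {device: remaining_ms(row) for device, row in measurements.items()}

for device, ms in overhead.items():
    print(f"{device}: {ms} ms outside openai-processing-ms")
```

On these numbers the reported processing time stays within a narrow band (343 ms to 380 ms) across the three setups, while the remainder grows from the PC to the laptop on Wi-Fi.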