Sid

@sidshekhar

AI models today are evaluated on college exam questions, not real-world tasks.

in our own trials, models differ vastly in how they perform on real-world tasks. specifically:
- tool calling
- parsing and understanding data effectively
- executing actions (via APIs and SDKs)

some of the vertical-specific tasks we've used to evaluate AI models while building our AI wallet gina:
- sending a transaction
- swapping from one asset into multiple assets
- executing cross-chain swaps
- fetching and analyzing historical price data for multiple assets
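a rough sketch of what a task-based eval like this could look like: each task pairs a prompt with the tool calls we'd expect, and a model passes only if it emits exactly those calls. everything here is an assumption for illustration (the `ToolCall` and `Task` types, the tool names `send_transaction` and `swap`, the argument shapes, and the `stub_model` stand-in); it is not gina's actual harness.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ToolCall:
    """One tool invocation emitted by a model (hypothetical shape)."""
    name: str
    args: dict


@dataclass
class Task:
    """A real-world task: a prompt plus the tool calls we expect back."""
    prompt: str
    expected: list


# A "model" here is any function mapping a prompt to a list of tool calls.
Model = Callable[[str], list]


def score_task(task: Task, model: Model) -> bool:
    """Pass only if the model emits exactly the expected calls, in order."""
    return model(task.prompt) == task.expected


def pass_rate(tasks: list, model: Model) -> float:
    """Fraction of tasks the model passes (0.0 for an empty task list)."""
    if not tasks:
        return 0.0
    return sum(score_task(t, model) for t in tasks) / len(tasks)


# Hypothetical tasks mirroring two items from the list above.
TASKS = [
    Task(
        prompt="send 0.1 ETH to alice.eth",
        expected=[ToolCall("send_transaction",
                           {"to": "alice.eth", "asset": "ETH", "amount": "0.1"})],
    ),
    Task(
        prompt="swap 100 USDC into ETH and WBTC, half each",
        expected=[
            ToolCall("swap", {"from": "USDC", "to": "ETH", "amount": "50"}),
            ToolCall("swap", {"from": "USDC", "to": "WBTC", "amount": "50"}),
        ],
    ),
]


def stub_model(prompt: str) -> list:
    """Stand-in for a real model: handles the send, drops half of the swap."""
    if prompt.startswith("send"):
        return [ToolCall("send_transaction",
                         {"to": "alice.eth", "asset": "ETH", "amount": "0.1"})]
    return [ToolCall("swap", {"from": "USDC", "to": "ETH", "amount": "50"})]


if __name__ == "__main__":
    print(f"pass rate: {pass_rate(TASKS, stub_model):.2f}")
```

exact-match scoring is deliberately strict. a real eval would likely also execute the calls against a sandbox or testnet and check the resulting state, since two different call sequences can both be correct.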