@nt
honestly, it's so much easier these days to build your own internal tools than to configure something off the shelf.
example: an AI-native agent eval runner/test harness. there are dozens of off-the-shelf products, but I was able to write one in ~30 minutes that does exactly what I want and that I can easily extend in whichever direction we need.