November saw a lot of changes in the TOP-10 of our LLM Benchmark. It also saw a few changes in how we build LLM-driven products. Let’s get started.
Update: Claude 3.5 Sonnet v2 - small capability improvement and great PDF handling
GPT-4o from November 20 - TOP 3!
Qwen 2.5 Coder 32B Instruct - mediocre but pushes SotA!
Qwen QwQ 32B Preview - too smart for its own good
Gemini Experimental 1121 - decent, but hard to get
Plans for LLM Benchmarks v2 - focus on cases and capabilities
Text-to-SQL Benchmark
LLM Benchmarks | November 2024
The benchmarks evaluate the models in terms of their suitability for digital product development. The higher the score, the better.
☁️ - Cloud models with proprietary license
✅ - Open source models that can be run locally without restrictions
🦙 - Local models with Llama license
Code: Can the model generate code and help with programming?
Cost: The estimated cost of running the workload. For cloud-based models, we calculate the cost according to the pricing. For on-premises models, we estimate the cost based on GPU requirements for each model, GPU rental cost, model speed, and operational overhead.
CRM: How well does the model support work with product catalogs and marketplaces?
Docs: How well can the model work with large documents and knowledge bases?
Integrate: Can the model easily interact with external APIs, services and plugins?
Marketing: How well can the model support marketing activities, e.g. brainstorming, idea generation and text generation?
Reason: How well can the model reason and draw conclusions in a given context?
Speed: The estimated speed of the model in requests per second (without batching). The higher the speed, the better.
Model | Code | CRM | Docs | Integrate | Marketing | Reason | Final | Cost | Speed |
---|---|---|---|---|---|---|---|---|---|
1. GPT o1-preview v1/2024-09-12 ☁️ | 95 | 92 | 94 | 95 | 88 | 87 | 92 | 52.32 € | 0.08 rps |
2. GPT o1-mini v1/2024-09-12 ☁️ | 93 | 96 | 94 | 83 | 82 | 87 | 89 | 8.15 € | 0.16 rps |
3. GPT-4o v3/2024-11-20 ☁️ | 86 | 97 | 94 | 95 | 88 | 72 | 89 | 0.63 € | 1.14 rps |
4. GPT-4o v1/2024-05-13 ☁️ | 90 | 96 | 100 | 92 | 78 | 74 | 88 | 1.21 € | 1.44 rps |
5. Google Gemini 1.5 Pro v2 ☁️ | 86 | 97 | 94 | 99 | 78 | 74 | 88 | 1.00 € | 1.18 rps |
6. GPT-4 Turbo v5/2024-04-09 ☁️ | 86 | 99 | 98 | 96 | 88 | 43 | 85 | 2.45 € | 0.84 rps |
7. Google Gemini Exp 1121 ☁️ | 70 | 97 | 97 | 95 | 72 | 72 | 84 | 0.89 € | 0.49 rps |
8. GPT-4o v2/2024-08-06 ☁️ | 90 | 84 | 97 | 86 | 82 | 59 | 83 | 0.63 € | 1.49 rps |
9. Google Gemini 1.5 Pro 0801 ☁️ | 84 | 92 | 79 | 100 | 70 | 74 | 83 | 0.90 € | 0.83 rps |
10. Qwen 2.5 72B Instruct ⚠️ | 79 | 92 | 94 | 97 | 71 | 59 | 82 | 0.10 € | 0.66 rps |
11. Llama 3.1 405B Hermes 3🦙 | 68 | 93 | 89 | 98 | 88 | 53 | 81 | 0.54 € | 0.49 rps |
12. Claude 3.5 Sonnet v2 ☁️ | 82 | 97 | 93 | 84 | 71 | 57 | 81 | 0.95 € | 0.09 rps |
13. GPT-4 v1/0314 ☁️ | 90 | 88 | 98 | 73 | 88 | 45 | 80 | 7.04 € | 1.31 rps |
14. X-AI Grok 2 ⚠️ | 63 | 93 | 87 | 90 | 88 | 58 | 80 | 1.03 € | 0.31 rps |
15. GPT-4 v2/0613 ☁️ | 90 | 83 | 95 | 73 | 88 | 45 | 79 | 7.04 € | 2.16 rps |
16. GPT-4o Mini ☁️ | 63 | 87 | 80 | 73 | 100 | 65 | 78 | 0.04 € | 1.46 rps |
17. Claude 3.5 Sonnet v1 ☁️ | 72 | 83 | 89 | 87 | 80 | 58 | 78 | 0.94 € | 0.09 rps |
18. Claude 3 Opus ☁️ | 69 | 88 | 100 | 74 | 76 | 58 | 77 | 4.69 € | 0.41 rps |
19. Meta Llama3.1 405B Instruct🦙 | 81 | 93 | 92 | 75 | 75 | 48 | 77 | 2.39 € | 1.16 rps |
20. GPT-4 Turbo v4/0125-preview ☁️ | 66 | 97 | 100 | 83 | 75 | 43 | 77 | 2.45 € | 0.84 rps |
21. Google LearnLM 1.5 Pro Experimental ⚠️ | 48 | 97 | 85 | 96 | 64 | 72 | 77 | 0.31 € | 0.83 rps |
22. GPT-4 Turbo v3/1106-preview ☁️ | 66 | 75 | 98 | 73 | 88 | 60 | 76 | 2.46 € | 0.68 rps |
23. Qwen 2.5 32B Coder Instruct ⚠️ | 43 | 94 | 98 | 98 | 76 | 46 | 76 | 0.05 € | 0.82 rps |
24. DeepSeek v2.5 236B ⚠️ | 57 | 80 | 91 | 80 | 88 | 57 | 75 | 0.03 € | 0.42 rps |
25. Meta Llama 3.1 70B Instruct f16🦙 | 74 | 89 | 90 | 75 | 75 | 48 | 75 | 1.79 € | 0.90 rps |
26. Google Gemini 1.5 Flash v2 ☁️ | 64 | 96 | 89 | 76 | 81 | 44 | 75 | 0.06 € | 2.01 rps |
27. Google Gemini 1.5 Pro 0409 ☁️ | 68 | 97 | 96 | 80 | 75 | 26 | 74 | 0.95 € | 0.59 rps |
28. Meta Llama 3 70B Instruct🦙 | 81 | 83 | 84 | 67 | 81 | 45 | 73 | 0.06 € | 0.85 rps |
29. GPT-3.5 v2/0613 ☁️ | 68 | 81 | 73 | 87 | 81 | 50 | 73 | 0.34 € | 1.46 rps |
30. Mistral Large 123B v2/2407 ☁️ | 68 | 79 | 68 | 75 | 75 | 70 | 72 | 0.57 € | 1.02 rps |
31. Google Gemini Flash 1.5 8B ☁️ | 70 | 93 | 78 | 67 | 76 | 48 | 72 | 0.01 € | 1.19 rps |
32. Google Gemini 1.5 Pro 0514 ☁️ | 73 | 96 | 79 | 100 | 25 | 60 | 72 | 1.07 € | 0.92 rps |
33. Google Gemini 1.5 Flash 0514 ☁️ | 32 | 97 | 100 | 76 | 72 | 52 | 72 | 0.06 € | 1.77 rps |
34. Google Gemini 1.0 Pro ☁️ | 66 | 86 | 83 | 79 | 88 | 28 | 71 | 0.37 € | 1.36 rps |
35. Meta Llama 3.2 90B Vision🦙 | 74 | 84 | 87 | 77 | 71 | 32 | 71 | 0.23 € | 1.10 rps |
36. GPT-3.5 v3/1106 ☁️ | 68 | 70 | 71 | 81 | 78 | 58 | 71 | 0.24 € | 2.33 rps |
37. Claude 3.5 Haiku ☁️ | 52 | 80 | 72 | 75 | 75 | 68 | 70 | 0.32 € | 1.24 rps |
38. GPT-3.5 v4/0125 ☁️ | 63 | 87 | 71 | 77 | 78 | 43 | 70 | 0.12 € | 1.43 rps |
39. Cohere Command R+ ☁️ | 63 | 80 | 76 | 72 | 70 | 58 | 70 | 0.83 € | 1.90 rps |
40. Mistral Large 123B v3/2411 ☁️ | 68 | 75 | 64 | 76 | 82 | 51 | 70 | 0.56 € | 0.66 rps |
41. Qwen1.5 32B Chat f16 ⚠️ | 70 | 90 | 82 | 76 | 78 | 20 | 69 | 0.97 € | 1.66 rps |
42. Gemma 2 27B IT ⚠️ | 61 | 72 | 87 | 74 | 89 | 32 | 69 | 0.07 € | 0.90 rps |
43. Mistral 7B OpenChat-3.5 v3 0106 f16 ✅ | 68 | 87 | 67 | 74 | 88 | 25 | 68 | 0.32 € | 3.39 rps |
44. Meta Llama 3 8B Instruct f16🦙 | 79 | 62 | 68 | 70 | 80 | 41 | 67 | 0.32 € | 3.33 rps |
45. Gemma 7B OpenChat-3.5 v3 0106 f16 ✅ | 63 | 67 | 84 | 58 | 81 | 46 | 67 | 0.21 € | 5.09 rps |
46. GPT-3.5-instruct 0914 ☁️ | 47 | 92 | 69 | 69 | 88 | 33 | 66 | 0.35 € | 2.15 rps |
47. GPT-3.5 v1/0301 ☁️ | 55 | 82 | 69 | 81 | 82 | 26 | 66 | 0.35 € | 4.12 rps |
48. Llama 3 8B OpenChat-3.6 20240522 f16 ✅ | 76 | 51 | 76 | 65 | 88 | 38 | 66 | 0.28 € | 3.79 rps |
49. Mistral 7B OpenChat-3.5 v1 f16 ✅ | 58 | 72 | 72 | 71 | 88 | 33 | 66 | 0.49 € | 2.20 rps |
50. Mistral 7B OpenChat-3.5 v2 1210 f16 ✅ | 63 | 73 | 72 | 66 | 88 | 30 | 65 | 0.32 € | 3.40 rps |
51. Qwen 2.5 7B Instruct ⚠️ | 48 | 77 | 80 | 68 | 69 | 47 | 65 | 0.07 € | 1.25 rps |
52. Starling 7B-alpha f16 ⚠️ | 58 | 66 | 67 | 73 | 88 | 34 | 64 | 0.58 € | 1.85 rps |
53. Mistral Nemo 12B v1/2407 ☁️ | 54 | 58 | 51 | 99 | 75 | 49 | 64 | 0.03 € | 1.22 rps |
54. Meta Llama 3.2 11B Vision🦙 | 70 | 71 | 65 | 70 | 71 | 36 | 64 | 0.04 € | 1.49 rps |
55. Llama 3 8B Hermes 2 Theta🦙 | 61 | 73 | 74 | 74 | 85 | 16 | 64 | 0.05 € | 0.55 rps |
56. Claude 3 Haiku ☁️ | 64 | 69 | 64 | 75 | 75 | 35 | 64 | 0.08 € | 0.52 rps |
57. Yi 1.5 34B Chat f16 ⚠️ | 47 | 78 | 70 | 74 | 86 | 26 | 64 | 1.18 € | 1.37 rps |
58. Liquid: LFM 40B MoE ⚠️ | 72 | 69 | 65 | 63 | 82 | 24 | 63 | 0.00 € | 1.45 rps |
59. Meta Llama 3.1 8B Instruct f16🦙 | 57 | 74 | 62 | 74 | 74 | 32 | 62 | 0.45 € | 2.41 rps |
60. Qwen2 7B Instruct f32 ⚠️ | 50 | 81 | 81 | 61 | 66 | 31 | 62 | 0.46 € | 2.36 rps |
61. Claude 3 Sonnet ☁️ | 72 | 41 | 74 | 74 | 78 | 28 | 61 | 0.95 € | 0.85 rps |
62. Mistral Small v3/2409 ☁️ | 43 | 75 | 71 | 74 | 75 | 26 | 61 | 0.06 € | 0.81 rps |
63. Mistral Pixtral 12B ✅ | 53 | 69 | 73 | 63 | 64 | 40 | 60 | 0.03 € | 0.83 rps |
64. Mixtral 8x22B API (Instruct) ☁️ | 53 | 62 | 62 | 97 | 75 | 7 | 59 | 0.17 € | 3.12 rps |
65. Anthropic Claude Instant v1.2 ☁️ | 58 | 75 | 65 | 77 | 65 | 16 | 59 | 2.10 € | 1.49 rps |
66. Codestral Mamba 7B v1 ✅ | 53 | 66 | 51 | 97 | 71 | 17 | 59 | 0.30 € | 2.82 rps |
67. Inflection 3 Productivity ⚠️ | 46 | 59 | 39 | 70 | 79 | 61 | 59 | 0.92 € | 0.17 rps |
68. Anthropic Claude v2.0 ☁️ | 63 | 52 | 55 | 67 | 84 | 34 | 59 | 2.19 € | 0.40 rps |
69. Cohere Command R ☁️ | 45 | 66 | 57 | 74 | 84 | 27 | 59 | 0.13 € | 2.50 rps |
70. Qwen1.5 7B Chat f16 ⚠️ | 56 | 81 | 60 | 56 | 60 | 36 | 58 | 0.29 € | 3.76 rps |
71. Mistral Large v1/2402 ☁️ | 37 | 49 | 70 | 83 | 84 | 25 | 58 | 0.58 € | 2.11 rps |
72. Microsoft WizardLM 2 8x22B ⚠️ | 48 | 76 | 79 | 59 | 62 | 22 | 58 | 0.13 € | 0.70 rps |
73. Qwen1.5 14B Chat f16 ⚠️ | 50 | 58 | 51 | 72 | 84 | 22 | 56 | 0.36 € | 3.03 rps |
74. MistralAI Ministral 8B ✅ | 56 | 55 | 41 | 82 | 68 | 30 | 55 | 0.02 € | 1.02 rps |
75. Anthropic Claude v2.1 ☁️ | 29 | 58 | 59 | 78 | 75 | 32 | 55 | 2.25 € | 0.35 rps |
76. Mistral 7B OpenOrca f16 ☁️ | 54 | 57 | 76 | 36 | 78 | 27 | 55 | 0.41 € | 2.65 rps |
77. MistralAI Ministral 3B ✅ | 50 | 48 | 39 | 89 | 60 | 41 | 54 | 0.01 € | 1.02 rps |
78. Llama2 13B Vicuna-1.5 f16🦙 | 50 | 37 | 55 | 62 | 82 | 37 | 54 | 0.99 € | 1.09 rps |
79. Mistral 7B Instruct v0.1 f16 ☁️ | 34 | 71 | 69 | 63 | 62 | 23 | 54 | 0.75 € | 1.43 rps |
80. Meta Llama 3.2 3B🦙 | 52 | 71 | 66 | 71 | 44 | 14 | 53 | 0.01 € | 1.25 rps |
81. Google Recurrent Gemma 9B IT f16 ⚠️ | 58 | 27 | 71 | 64 | 56 | 23 | 50 | 0.89 € | 1.21 rps |
82. Codestral 22B v1 ✅ | 38 | 47 | 44 | 84 | 66 | 13 | 49 | 0.06 € | 4.03 rps |
83. Qwen: QwQ 32B Preview ⚠️ | 43 | 32 | 74 | 52 | 48 | 40 | 48 | 0.05 € | 0.63 rps |
84. Llama2 13B Hermes f16🦙 | 50 | 24 | 37 | 75 | 60 | 42 | 48 | 1.00 € | 1.07 rps |
85. IBM Granite 34B Code Instruct f16 ☁️ | 63 | 49 | 34 | 67 | 57 | 7 | 46 | 1.07 € | 1.51 rps |
86. Meta Llama 3.2 1B🦙 | 32 | 40 | 33 | 53 | 68 | 51 | 46 | 0.02 € | 1.69 rps |
87. Mistral Small v2/2402 ☁️ | 33 | 42 | 45 | 88 | 56 | 8 | 46 | 0.06 € | 3.21 rps |
88. Mistral Small v1/2312 (Mixtral) ☁️ | 10 | 67 | 63 | 65 | 56 | 8 | 45 | 0.06 € | 2.21 rps |
89. DBRX 132B Instruct ⚠️ | 43 | 39 | 43 | 74 | 59 | 10 | 45 | 0.26 € | 1.31 rps |
90. NVIDIA Llama 3.1 Nemotron 70B Instruct🦙 | 68 | 54 | 25 | 72 | 28 | 21 | 45 | 0.09 € | 0.53 rps |
91. Mistral Medium v1/2312 ☁️ | 41 | 43 | 44 | 59 | 62 | 12 | 44 | 0.81 € | 0.35 rps |
92. Microsoft WizardLM 2 7B ⚠️ | 53 | 34 | 42 | 66 | 53 | 13 | 43 | 0.02 € | 0.89 rps |
93. Llama2 13B Puffin f16🦙 | 37 | 15 | 44 | 67 | 56 | 39 | 43 | 4.70 € | 0.23 rps |
94. Mistral Tiny v1/2312 (7B Instruct v0.2) ☁️ | 22 | 47 | 59 | 53 | 62 | 8 | 42 | 0.05 € | 2.39 rps |
95. Gemma 2 9B IT ⚠️ | 45 | 25 | 47 | 36 | 68 | 13 | 39 | 0.02 € | 0.88 rps |
96. Meta Llama2 13B chat f16🦙 | 22 | 38 | 17 | 65 | 75 | 6 | 37 | 0.75 € | 1.44 rps |
97. Mistral 7B Zephyr-β f16 ✅ | 37 | 34 | 46 | 62 | 29 | 4 | 35 | 0.46 € | 2.34 rps |
98. Meta Llama2 7B chat f16🦙 | 22 | 33 | 20 | 62 | 50 | 18 | 34 | 0.56 € | 1.93 rps |
99. Mistral 7B Notus-v1 f16 ⚠️ | 10 | 54 | 25 | 60 | 48 | 4 | 33 | 0.75 € | 1.43 rps |
100. Orca 2 13B f16 ⚠️ | 18 | 22 | 32 | 29 | 67 | 20 | 31 | 0.95 € | 1.14 rps |
101. Mistral 7B Instruct v0.2 f16 ☁️ | 11 | 30 | 54 | 25 | 58 | 8 | 31 | 0.96 € | 1.12 rps |
102. Mistral 7B v0.1 f16 ☁️ | 0 | 9 | 48 | 63 | 52 | 12 | 31 | 0.87 € | 1.23 rps |
103. Google Gemma 2B IT f16 ⚠️ | 33 | 28 | 16 | 47 | 15 | 20 | 27 | 0.30 € | 3.54 rps |
104. Microsoft Phi 3 Medium 4K Instruct 14B f16 ⚠️ | 5 | 34 | 30 | 32 | 47 | 8 | 26 | 0.82 € | 1.32 rps |
105. Orca 2 7B f16 ⚠️ | 22 | 0 | 26 | 26 | 52 | 4 | 22 | 0.78 € | 1.38 rps |
106. Google Gemma 7B IT f16 ⚠️ | 0 | 0 | 0 | 6 | 62 | 0 | 11 | 0.99 € | 1.08 rps |
107. Meta Llama2 7B f16🦙 | 0 | 5 | 22 | 3 | 28 | 2 | 10 | 0.95 € | 1.13 rps |
108. Yi 1.5 9B Chat f16 ⚠️ | 0 | 4 | 29 | 17 | 0 | 8 | 10 | 1.41 € | 0.76 rps |
Claude 3.5 v2 Update and document extraction in manufacturing
In the October LLM Benchmark we wrote that Anthropic had achieved a small improvement in the capabilities of its Claude 3.5 v2. That improvement is real, but too small to put the model into the TOP-10.
Yet Anthropic Claude 3.5 Sonnet v2 is currently our first choice for data extraction projects (e.g. as part of business automation in manufacturing). Why is that?
For example, imagine that you need to carefully extract product specification data for 20,000 electrical components out of 1,000 data sheets. These PDFs could include complex tables and even charts. The extracted data could then be used to compare a company's products with those of competitors, offer equivalent components in inline ads, or drive supply chain decisions.
Anthropic Claude 3.5 Sonnet v2 has two nice features that work well together:
Native PDF Handling - we can now upload PDF files directly into the API along with the data extraction instructions. Under the hood, the Anthropic API breaks the PDF down into pages and uploads each page twice: as an image and as text. This works well enough out of the box to replace previously complicated setups that used dedicated VLMs (Visual Language Models) running on local GPUs.
PDFs can consume a lot of tokens, especially when accompanied by a large system prompt. To speed up processing, improve accuracy and lower costs, we use two-level prompt caching from Anthropic. This allows us to pay the full cost of PDF tokenisation only once.
Here is how our prompt can look for the data extraction:
System prompt: Your task is to extract product data from the PDF. Here is the schema (large schema) and company context.
Document prompt: Here is the PDF to extract the data from. It has multiple products. (large PDF)
Task: extract product X from the PDF.
This way we can extract multiple products from a single PDF (following the checklist pattern). The system prompt (1) and the document prompt (2) will be cached across all extraction requests against the same PDF. The system prompt (1) alone will be cached across all requests for this type of PDF extraction in general.
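Here is a minimal sketch of what such a request could look like with the Anthropic Python SDK. The document block and cache_control breakpoints follow Anthropic's documented API, but the beta flags, model name and schema placeholder are assumptions to verify against the current documentation rather than our production code.

```python
import base64
import anthropic

client = anthropic.Anthropic()

# (1) System prompt: extraction instructions + schema + company context.
#     Cached across ALL extraction requests of this type.
system_blocks = [
    {
        "type": "text",
        "text": "Your task is to extract product data from the PDF. "
                "Here is the schema and company context: ...",  # large schema goes here
        "cache_control": {"type": "ephemeral"},  # first cache breakpoint
    }
]

# (2) Document prompt: the PDF itself.
#     Cached across all requests against the SAME PDF.
pdf_data = base64.standard_b64encode(open("datasheet.pdf", "rb").read()).decode()

def extract_product(product_name: str):
    return client.messages.create(
        model="claude-3-5-sonnet-20241022",  # Claude 3.5 Sonnet v2
        max_tokens=4096,
        system=system_blocks,
        messages=[{
            "role": "user",
            "content": [
                {
                    "type": "document",
                    "source": {"type": "base64", "media_type": "application/pdf", "data": pdf_data},
                    "cache_control": {"type": "ephemeral"},  # second cache breakpoint
                },
                # (3) Task: varies per request, so it stays outside the cache.
                {"type": "text", "text": f"Extract product {product_name} from the PDF."},
            ],
        }],
        extra_headers={  # beta flags at the time of writing; may be unnecessary on newer API versions
            "anthropic-beta": "pdfs-2024-09-25,prompt-caching-2024-07-31",
        },
    )

# Checklist pattern: one request per product; the PDF is tokenised and cached only once.
for product in ["X-100", "X-200", "X-300"]:
    print(extract_product(product).content[0].text)
```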
Whenever a portion of the prompt is cached on the server, it costs less and runs faster: 30-70% faster and 50-90% cheaper, as described in Anthropic's documentation. In data extraction cases, cost savings tend to be closer to the upper end of that range.
This is how things look in action: 'Cache creation' indicates when part of the prompt is stored in the cache, and 'Cache read' shows when the cached prompt is reused, saving time and money.
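To make those numbers concrete, here is a back-of-the-envelope calculation. The token counts are invented, and the multipliers (cache writes costing about 25% more than base input, cache reads about 90% less) reflect Anthropic's published pricing at the time of writing; check the current pricing page before relying on them.

```python
# Hypothetical workload: one datasheet PDF (~60k input tokens), queried 20 times
# (one request per product on the checklist). Assumed base input price: $3 per 1M tokens.
BASE = 3.00 / 1_000_000
pdf_tokens, requests = 60_000, 20

no_cache = pdf_tokens * requests * BASE                          # every request pays full price
with_cache = pdf_tokens * (1.25 + 0.10 * (requests - 1)) * BASE  # write once, read 19 times

print(f"without caching: ${no_cache:.2f}, with caching: ${with_cache:.2f}")
# -> without caching: $3.60, with caching: $0.57 (roughly 84% cheaper, i.e. the upper end of the range)
```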
There is a small caveat. Anthropic models don't have the Structured Output capability of OpenAI, so you might think we lose two valuable features:
Precise schema following
The ability to hardcode a custom chain-of-thought process that guides the LLM through the data extraction.
However, this is not the case! Structured Output is just an inference capability that drives constrained decoding (token selection) to follow the schema precisely. A capable LLM can extract even a complex structure without it, and while doing so it will follow the chain-of-thought process encoded in the schema definition.
Anthropic Claude 3.5 Sonnet v2 can certainly do that. And in the 5-7% of cases that return a slightly invalid schema, we can pass the result to GPT-4o for schema repair.
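As an illustration of that fallback, the sketch below validates Claude's raw JSON against a toy Pydantic schema (note the leading reasoning field, which is where a custom chain of thought can be encoded) and only calls GPT-4o with Structured Outputs when validation fails. The schema and the repair prompt are invented for this example and are not our production definitions.

```python
from pydantic import BaseModel, ValidationError
from openai import OpenAI

# Toy extraction schema. The leading "reasoning" field makes the model explain
# which table or chart it used before committing to the extracted values.
class ProductSpec(BaseModel):
    reasoning: str
    product_name: str
    voltage_v: float | None
    max_current_a: float | None

def parse_or_repair(claude_json: str) -> ProductSpec:
    """Accept Claude's output if it matches the schema; otherwise repair it with GPT-4o."""
    try:
        return ProductSpec.model_validate_json(claude_json)
    except ValidationError:
        # The 5-7% of responses that are structurally close but slightly invalid land here.
        client = OpenAI()
        repaired = client.beta.chat.completions.parse(
            model="gpt-4o-2024-11-20",
            messages=[
                {"role": "system", "content": "Repair this JSON so that it matches the schema. "
                                              "Do not change any of the extracted values."},
                {"role": "user", "content": claude_json},
            ],
            response_format=ProductSpec,  # Structured Outputs enforce the schema exactly
        )
        return repaired.choices[0].message.parsed
```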
For reference, here is an example of Structured Output definition from one of the projects (image quality was lowered intentionally).
GPT-4o from November 20 - TOP 3
OpenAI didn’t bother to publish a proper announcement for this model (gpt-4o-2024-11-20 in the API). They just tweeted the update:
The model deserves a special mention in our benchmarks. Compared to the previous GPT-4o v2/2024-08-06, the model shows noticeable improvement, especially in the Reason category.
You can also note the usual pattern of OpenAI with the models:
First, they release a new powerful model (GPT-4o v1 in this case)
Then they release the next model in the same family that is much cheaper to run
Finally, they make that model better while keeping the lower running costs.
Multiple Qwen models from Alibaba
Qwen 2.5 Coder 32B Instruct is a new model in the Qwen family. It will first make you sad and then glad.
The model itself can be downloaded from HuggingFace and run locally on your hardware.
The sad part is that this coding model performed poorly in our Code+Eng category of tasks. It was able to handle pure coding tasks, but failed to deal with more complex code review and analysis challenges. Besides, its Reason score is quite low: 46.
Then again, what would you expect from a model that is called “Coder”, right? And in actual coding this model is quite good: it performed as well as Sonnet 3.5 in a coding-only benchmark for complex text-to-SQL tasks (more on that later).
So what is the glad part? This coding-oriented model represents a new quality improvement for local models in the “can run on an A100/H100 GPU” category!
By the way, it is interesting to note that a few other big quality improvements that pushed State of the Art for local models were also driven by Qwen.
It is also interesting that the “o1-killer” from Qwen didn’t score that high on our benchmark. Qwen QwQ 32B Preview was designed to push the state of the art in reasoning capabilities, and according to some benchmarks it succeeded. However, it doesn’t look like a good fit for product tasks and business automation. Why? It talks too much and doesn’t follow instructions.
For example, take the prompt below, which is also reinforced by a couple of samples:
You extract product properties from provided text. Respond in format: "number unit" or "N/A" if can't determine. Strip quotes, thousands separators and comments.
The model will tend to start the response this way:
Alright, I've got this text about an electric screwdriver,...
Even the tiny mistral-7b-instruct-f16 would answer precisely, with something like 1300 rpm.
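To see why that verbosity hurts, here is roughly what a strict format check for this task could look like. This is an illustration, not our actual evaluation code:

```python
import re

# The prompt asks for exactly "number unit" or "N/A" -- nothing else.
ANSWER_RE = re.compile(r"^(N/A|\d+(\.\d+)?\s+\w+)$")

def follows_format(answer: str) -> bool:
    return bool(ANSWER_RE.match(answer.strip()))

print(follows_format("1300 rpm"))   # True  -- precise answer, easy to score
print(follows_format("N/A"))        # True
print(follows_format("Alright, I've got this text about an electric screwdriver, "
                     "and the speed seems to be 1300 rpm."))  # False -- right value, wrong format
```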
This might seem like an unfair comparison of QwQ against a top model like o1-preview: o1 gets a chance to reason in private before providing its response (it uses reasoning tokens for that), while QwQ has to reason out loud inside its answer.
To make things fairer for the new generation of reasoning models, we will change things a bit in the next major update of our benchmark: models will be allowed to reason before providing an answer. Models that think too much will be naturally penalised by their cost and high latency.
LLM Benchmark v2
We’ve been running the current version of the benchmark without major changes for almost a year and a half. We avoided changes to keep benchmark results comparable between models and test runs.
However, a lot has changed in the landscape since July 2023:
Structured Outputs allow us to define a precise response format and even drive a custom chain of thought for complex tasks.
Multi-modal language models can handle images and audio in addition to text input. Image inputs are used heavily in document extraction.
Prompt caching shifts the perspective for building RAG systems, running complex checklists, or extracting data from large numbers of documents.
New reasoning models allow us to push model performance forward by breaking down complex tasks into small steps and then investing (paid) time to think through them.
In addition to that, we’ve gained a lot more insight into building LLM-driven systems and added more cases to our AI portfolio.
It is time for a big refresh. Work on the TIMETOACT GROUP LLM Benchmark v2 has already started. We expect to publish the first draft report early next year.
The V2 benchmark will keep the foundations from v1 but will focus more on concrete AI Cases and new model capabilities. More charts are to be expected, too.
Gemini Experimental 1121 - Good, but “unobtanium”
Gemini Experimental 1121 is a new prototype model from Google. It is currently available in test environments like AI Studio or OpenRouter. This model doesn’t push the state of the art for Gemini, but it proves that Google’s presence in the TOP-10 is not a lucky coincidence: it is the third Gemini model in the TOP-10.
However, this model is currently nearly impossible to use at any scale. It is provided for free but is heavily rate-limited: it took 3 days and multiple API keys just to run a few hundred evals from our benchmark.
Text-to-SQL Benchmark
Neo4j has published a video from its NODES24 conference about benchmarking different LLMs in text-to-SQL and text-to-Cypher tasks.
An example of a Text-to-SQL task is when an LLM is used to translate a human request into a complex query against a company SQL database, e.g. for self-service reporting. Text-to-Cypher is the same, but the queries run against graph databases like Neo4j.
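For a feel of what such a task involves, here is a made-up miniature example. The schema, question and query are invented for illustration and have nothing to do with the actual benchmark database:

```python
# A text-to-SQL task: turn a business question into a query over a known schema.
schema = """
CREATE TABLE systems      (id INT PRIMARY KEY, name TEXT, owner_department TEXT);
CREATE TABLE dependencies (system_id INT, depends_on_id INT, criticality TEXT);
"""

question = "Which systems owned by Finance depend on more than three critical systems?"

# The kind of query a capable LLM would be expected to produce for the question above:
expected_sql = """
SELECT s.name, COUNT(*) AS critical_deps
FROM systems s
JOIN dependencies d ON d.system_id = s.id
WHERE s.owner_department = 'Finance' AND d.criticality = 'critical'
GROUP BY s.name
HAVING COUNT(*) > 3;
"""
```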
The research and presentation was done in partnership with two companies from the TIMETOACT GROUP: X-Integrate and TIMETOACT GROUP Austria.
The most important slide of the presentation is the one below. It shows the accuracy with which different LLMs generated queries for a complex database. This database held information about technical and organisational dependencies in the company for risk management purposes.
“Basic” scores are achieved without any performance optimisations, while “Full” scores employ a range of optimisations to boost the accuracy of query generation.
You can learn more about these optimisations (and about the benchmark) by watching the presentation online on YouTube.
Some of these text-to-query tasks will even be included in our upcoming LLM v2 benchmark.
Transform Your Digital Projects with the Best AI Language Models!
Discover the transformative power of the best LLMs and revolutionize your digital products with AI! Stay future-focused, boost efficiency, and gain a clear competitive edge. We help you elevate your business value to the next level.
Martin Warnung
martin.warnung@timetoact.at