Building an Ecosystem for Open Foundation Models, Together

In this talk, Ce Zhang shares experiences in building the open source foundation model ecosystem through collaboration with the community. He delves into how balancing data quality, model architecture and infrastructure presents both opportunities and challenges. He also discusses navigating the extensive scale and cost of GPU clusters and optimizing their usage. Most importantly, he explores how data quality can be reasoned about in a structured manner to boost model quality.
This video provides a unique perspective on managing technical issues in open source ecosystems and is a must-watch for those interested in understanding the behind-the-scenes of data science and AI development.
👉 Sign up for our "No BS" Newsletter to get the latest technical data & AI content: hubs.li/Q02vz6xC0
#opensource #gpu #dataquality
ABOUT DATA COUNCIL:
Data Council brings together the brightest minds in data to share industry knowledge, technical architectures and best practices in building cutting-edge data & AI systems and tools.
FIND US:
Twitter: datacouncilai
LinkedIn: www.linkedin.com/company/datacouncil-ai/
Website: www.datacouncil.ai/

Videos

10:27

Stochastic | AI Launchpad '24

250 views · 2 months ago

Stochastic is an end-to-end AI platform for enterprise knowledge work that provides personalized AI agents with zero setup or coding. ABOUT THE SPEAKER: Glenn Ko, Co-founder & CEO, Stochastic. AI LAUNCHPAD: Data Council and Zero Prime Ventures partnered to give six AI-first startups a chance to present brief demos on stage to top investors and elite founders during Data Council's annual conference i...

9:31

sea.dev | AI Launchpad '24

260 views · 2 months ago

sea.dev is breaking the constraints of existing data systems and NL2SQL with graph-based tools to allow LLM apps to reliably act on fintech data. ABOUT THE SPEAKERS: Matt Arderne, Co-founder, sea.dev; Marya Bazzi, Co-founder, sea.dev; Vladimirs Murevics, Co-founder, sea.dev. AI LAUNCHPAD: Data Council and Zero Prime Ventures partnered to give six AI-first startups a chance to present brief demos on sta...

6:41

Phaselab | AI Launchpad '24

87 views · 2 months ago

Phaselab builds smart automation to make companies’ data privacy programs more effective and efficient. ABOUT THE SPEAKER: Josh Schwartz, Co-founder & CEO, Phaselab. AI LAUNCHPAD: Data Council and Zero Prime Ventures partnered to give six AI-first startups a chance to present brief demos on stage to top investors and elite founders during Data Council's annual conference in Austin 2024. 👉 Sign up fo...

8:16

Parea | AI Launchpad '24

278 views · 2 months ago

Parea builds developer tools for evaluating, testing and monitoring LLM-powered applications. ABOUT THE SPEAKER: Joel Alexander, Co-founder, Parea. AI LAUNCHPAD: Data Council and Zero Prime Ventures partnered to give six AI-first startups a chance to present brief demos on stage to top investors and elite founders during Data Council's annual conference in Austin 2024. 👉 Sign up for our “No BS” News...

9:12

InQuery | AI Launchpad '24

236 views · 2 months ago

InQuery simplifies data lakehouse maintenance, saving your data team time and money. ABOUT THE SPEAKERS: Erick Enriquez, Co-founder & CEO, InQuery; Khalil Miri, Co-founder & CTO, InQuery. AI LAUNCHPAD: Data Council and Zero Prime Ventures partnered to give six AI-first startups a chance to present brief demos on stage to top investors and elite founders during Data Council's annual conference in Aust...

12:52

Dataland | AI Launchpad '24

199 views · 2 months ago

Dataland is the AI-powered internal tools platform. It is the easiest way to deliver high-quality internal tools to your business users. ABOUT THE SPEAKER: Arthur Wu, Co-founder, Dataland. AI LAUNCHPAD: Data Council and Zero Prime Ventures partnered to give six AI-first startups a chance to present brief demos on stage to top investors and elite founders during Data Council's annual conference in Aus...

33:09

Rising Tides with Radical Transparency: Why and How to Open Source Your Data Platform

125 views · 2 months ago

Join Tim Castillo from Dagster Labs for an insightful journey into how their data platform became successfully open-sourced. Discover the hurdles, cultural shifts and innovative implementations behind this strategic decision. Data engineers, analytics engineers and data platform engineers - learn how to leverage open source to enhance your projects and contribute to the data community. 👉 Sign u...

29:32

Case Studies from a Methodologist on an Experimentation Platform

308 views · 2 months ago

Dive into the world of A/B testing with Microsoft's Experimentation Platform Team. Join Laura Cosgrove for an exclusive tech talk where she uncovers the secrets behind Microsoft’s cutting-edge statistical evaluation and simulation frameworks. In this video, discover the power of Microsoft's variance reduction estimator and its game-changing impact on service efficacy. Ready to elevate your A/B ...

31:42

A 101 in Time Series Analytics with Apache Arrow, Pandas and Parquet

1K views · 2 months ago

Dive deep into the world of databases and analytics in this talk from Zoe Steinkamp of InfluxData. Learn how you can unleash the potential of Apache Arrow and Apache Parquet for efficient, scalable handling of time-series data. Equip your toolbox with cutting-edge open-source technologies and industry-standard analytics libraries to build the foundation of a high performance analytics applicati...

33:58

Unified Stream/Batch Execution with Ibis

519 views · 2 months ago

This talk is a deep dive exploration into the powerful world of Ibis, as Voltron Data showcases their recent work merging batch and streaming concepts and introducing an Apache Flink backend. This comprehensive tutorial will provide you with invaluable insights for working with data across a variety of platforms. Watch the full video to explore the potential of a unified approach for both batch...

23:17

How Beam Uses Code-Based Dashboards to Scale Analytics Products

317 views · 2 months ago

In this talk, Emilio Tamez unravels the magic behind dashboards-as-code. From Python scripts to modular design, Beam is breaking down the barriers between complexity and simplicity. The dashboards-as-code methodology has allowed Beam to incrementally approach their goals by building boilerplate dashboards as a series of code-defined, standardized modules which can be arranged into a dashboard i...

35:26

Building Responsible and Trustworthy Generative AI Products at LinkedIn

534 views · 2 months ago

Dive into the heart of LinkedIn's commitment to ethical AI development, where revolutionary Generative AI meets responsibility. Listen in to this insightful exploration as Daniel Olmedilla unveils the foundational principles and architecture guiding LinkedIn's AI journey. With a special focus on their cutting-edge Generative AI products and features, this talk gives an exclusive look into Linke...

31:16

What Makes for an Effective Data Practitioner in 2024?

406 views · 2 months ago

Listen in as Marck Vaisman shares insights from his years of experience and demystifies the complexities of the data practitioner role, while providing a roadmap for skill development across all levels. Whether you're a seasoned leader aiming to upskill your team or a novice stepping into the realm of data, this video offers valuable guidance to propel your career in the right direction. 👉 Sign...

28:46

Is Kubernetes a Database?

501 views · 2 months ago

Uncover how Kubernetes extends beyond stateless apps and now supports stateful workloads and database management with Custom Resources. In this video, discover the potential to eliminate traditional databases by transforming the Kubernetes API into a potent database and metastore. Don't miss this chance to learn how leveraging Kubernetes can revolutionize your tech projects. 👉 Sign up for our "...

42:00

How Developers Should Think About the Emerging AI Stack | Together, Pinecone, Anthropic

572 views · 2 months ago

29:13

From Playgrounds to Production: The Evolution of AI Evaluation at Coda

98 views · 2 months ago

24:26

Events Sourcing with Kafka at Scale

148 views · 2 months ago

11:39

Creating a Competitive Advantage in the Age of Intelligence as a Service

105 views · 2 months ago

22:42

Build Faster, More Responsive Analytics with a Semantic Layer | Cube Workshop

276 views · 2 months ago

29:39

Streaming CDC data from PostgreSQL to Snowflake, challenges and solutions

456 views · 2 months ago

35:54

OttoBot: Productionizing LLM Models

145 views · 2 months ago

25:32

Building a User-Level Targeting Platform

136 views · 2 months ago

28:39

Data Culture 2.0: Leveraging AI to Build Human Connections and Expand Your Influence

97 views · 2 months ago

27:56

Beyond Kafka: Cutting Costs and Complexity with WarpStream and S3

262 views · 2 months ago

37:44

Ten Years of Building Open Source Standards

248 views · 2 months ago

35:02

Move Fast and Don't Break Things -- How to Build a Data Platform that Scales with your Organization

310 views · 2 months ago

12:17

Redefining Database Workloads: The Future with Modern Object Storage

96 views · 2 months ago

39:16

Beyond MLOps: Building AI systems with Metaflow

627 views · 2 months ago

18:04

How to Align AI Capabilities with Product Strategy so You Can Innovate

216 views · 2 months ago

Comments

@Anhar001 7 hours ago
all this jank just to solve the issue which is basically Python. Just write a fully statically compiled binary and shove that on an NFS, then just use rsync between dev machines and NFS. Have a shell script watch binary file changes and relaunch when file is changed. Look ma, I just replaced entire solid with a few bash scripts 😂
@jimshtepa5423 16 hours ago
10:55 what's wrong with Uzbekistan?))))
@krishnapraveen777 21 hours ago
Chad engineer
@hemantishwaran5741 1 day ago
It’s great for ggplot and webpages. But if you ever write a textbook go straight to latex from the command line.
@malware_creations2606 1 day ago
Also I've read that Kafka has an issue with consumer lag. How do you handle that?
@zuowang5185 11 days ago
Is there an updated version of the logging pipeline 4 years later?
@bluejinux 14 days ago
One of the best presentations on the purpose of the data warehouse and data lakehouse, and where the future of data is going.
@randomhandle307 17 days ago
Very nice. Thanks
@AndreaMontes_ 20 days ago
I'm rewatching this talk, the speaker is quite good. Taking some notes to prepare my own talk
@hannahnelson4569 25 days ago
Very cool talk! The idea of learning heuristics was very cool! I didn't quite understand how the criterion for splitting down multiple paths works! I will check out the source code! Thank you for hosting this talk!
@fb-gu2er 28 days ago
Backend in Python? Yikes
@guykerem7874 29 days ago
One of the best talks on data in 2024. Thank you Abhi! You never miss a chance to inspire and impress
@tessafelice2181 1 month ago
I love the name mother duck. I feel it’s a respectful tribute to the female source of life and code.
@CreativeInspireP380 1 month ago
This was an extremely informative talk - especially the section on challenges - and one I wish would receive more attention due to how useful it is as an overview to quite a few complex and highly relevant issues. It would be nice if it were re-elaborated and presented in a non-live presentation format.
@the-ghost-in-the-machine1108 1 month ago
thanks
@nosh3019 1 month ago
Great talk 🎉
@jayleejw1801 1 month ago
The amount of background noise in this video is absurd.
@tratkotratkov126 1 month ago
Great, very much needed and promising project! However, it is not quite clear what you mean when you are talking about data versioning (DV) - do you version the data as LakeFS does or are you just versioning the source code which is producing this data? Also the diagrams in the presentation (Virtual/Physical layers) I find confusing and not easy to grasp at first glance. It will be nice in the next iteration if you use some real world/practical entities to describe demo objects like customer, product, sales etc. instead of just “source” and wrap the demo in some quick story like “Meet Alex, the data engineer at TechCorp, a rapidly growing tech company. Alex is responsible for managing the company’s data pipelines, ensuring that data from various sources is clean, consistent, and available for analysis” etc. you got the idea. Finally I would suggest you switch the sequence and the time you spend on the theory and the demo part - show your fantastic open source project demo first and how easy is implementing the 3 concepts in meaningful story then after each segment just mention the theoretical part, but don’t allow the theory to consume 75% of your presentation unless you want to be considered as one of the many Data Governance “gurus” which are presenting on this channel. Wishing you all good luck with this fantastic project!
@LucasCardoso-mw4ok 1 month ago
Hi! Nice video. I'm a little concerned about how I can get my development data from Copilot.
@KC53557 1 month ago
A good example of not getting AI right is the creation of the Maga loon and Jan 6.
@68sahil56 1 month ago
30:29
@68sahil56 1 month ago
18:19
@VipulVaibhaw 1 month ago
Fantastic talk!
@allthingsdata 1 month ago
Loved it.
@AshishKumar-ll2mt 1 month ago
Looks like this field never took off the way it should have
@yogeshbharadwaj6200 1 month ago
Very nice demo..Tks..
@compilation_exe3821 1 month ago
Amazing
@timothymcglynn1935 1 month ago
HI 👋
@HikarusVibrator 1 month ago
If someone can explain to me how you’re supposed to do a major version DB upgrade with a Debezium connector. It’s such an unbelievable pain that it’s a total dealbreaker. Unless I’m missing something
@Eriddoch 1 month ago
Dang, Miriah you are an AMAZING speaker, and as someone who works on data engineering systems but doesn't own them (MLOps), this is really valuable.
@420_gunna 1 month ago
bullshit buzzwords "cognitive analytics" vomit and a saccharine exhortative tone "quantum computing + graphene + ai" come on
@paoloogr 2 months ago
Nice talk! Thanks.
@ex-cursion 2 months ago
I loved this and wish there was more of it. Thank you! But as noted: 'invoice reconciliation is boring'. I feel like the survival of our species will pivot not on our curiosity, but on our capacity to constrain our desire for novelty enough to solve boring problems.
@matthewborn 2 months ago
This is an excellent talk. Thank you, Abhi!
@malcolmgdavis 2 months ago
Pointer vs. Value discussion: Based on the Method vs. Function discussion, ADT should be strictly adhered to. Operations that modify the ADT are modeled as functions that take the old state as an argument and return the new state as part of the result. In other words, a function should enforce immutability. The ADT approach helps with concurrency, making the code cleaner and easier to read. As an API user, I shouldn't worry about the state changing when I pass a structure. Of course, the pure ADT model's problem is memory consumption. That's why ADT models are generally implemented in VMs that can routinely find old structures without references and remove them from memory.
@malcolmgdavis 2 months ago
The method vs. function debate is absurd. The presenter needs to learn or spend time with OO programming. Class methods don't have to be logically connected to states. I developed in C during the 80s. The problem with structs is that the data is the point of coupling. The class hides data. In OO, the focus is on behavior and not the state. The OO state can be anywhere and can change. The strategy allows the implementation of the module to be changed without disturbing the client programs.
@1988YUVAL 2 months ago
Very interesting presentation. Looks like a very well thought out solution for managing data transformations. I wonder if it will take off like dbt.
@Jack-lg9mq 2 months ago
Good presentation. Also nice to see that Jimmi Simpson is expanding his horizons.
@mattbahr228 2 months ago
Awesome presentation!
@wonlee4138 2 months ago
Thanks for the great presentation!
@prashant776 2 months ago
Really good and informative. I congratulate PeerDB for their recent seed round secured. I see there is a lot of potential in PeerDB where organisations are looking to stream their data to warehouse. I have had a very unique need, I wish PeerDB was a wonderful choice back then.
@AndreaMontes_ 2 months ago
Great speaker 👏👏
@thrawn01 2 months ago
This was super useful, I learned a lot, Thank you!
@IbraheemFaiq 2 months ago
Great
@samhughes1747 2 months ago
I really enjoyed this. It was high-level, but hey, a hype-free, facts-only talk about working with generative models? I'll take it!
@Shikara_Animals 2 months ago
Best teacher ❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤❤
@VijayasarathyMuthu 2 months ago
You should include LightDash
@whatSriBishnusRajDharmaN-ek1hl 2 months ago
mother chods what doing here canot learn me detect leran mine concern your life risk at usa houston
@clarkylifehacks8220 2 months ago
This is great. Not the same context (not data), but I do 3 of the 4 roles under incident management, it can get messy!
@HwansungMedicalCharitySe-pn4vf 2 months ago
Beautiful topic, ugly tune. Reason for less likes. Suggestion: improve your tune and try to relax and be calmed.