
data subtldr week 15 year 2024

r/MachineLearning, r/dataengineering, r/sql

Databricks' Controversial Strategy Sparks Debate, Competitiveness of Top PhD Programs in Machine Learning, Job Market's Increasing Demands on Data Engineers, SAP's User Frustrations, and SQL Skills for CV: A Closer Look

Week 15, 2024
Posted in r/MachineLearning by u/MLPhDStudent on 4/13/2024
552

[D] Folks here have no idea how competitive top PhD program admissions are these days, wow...

Discussion
The Reddit thread initiated by a Computer Science PhD student details the rising competitiveness in the admission process for top-tier PhD programs. The student emphasizes the importance of strong recommendation letters, personal connections to faculty, and multiple top conference papers. However, several commenters argue that attending such competitive institutions is not a prerequisite for successful research or high earnings in the industry. While some note that this high level of competition is specific to the U.S. and top-tier schools, others express concerns over the increasing expectations for prospective students. Some argue that valuable research often takes years and that the emphasis on producing numerous papers could lead to transient, incremental improvements rather than substantive scientific advancement. The overall sentiment is mixed, with many expressing concern about the high stress and competition, while others emphasize alternative routes to success in the field.
236 comments
Posted in r/dataengineering by u/_areebpasha on 4/11/2024
389

Common DE pipelines and their tech stacks on AWS, GCP and Azure

Discussion
The Reddit thread titled 'Common DE pipelines and their tech stacks on AWS, GCP and Azure' discusses the complexity of modern data engineering systems. Users were skeptical of the accuracy of the information presented, suggesting that it may be outdated or oversimplified. Some comments compared it to a marketing gimmick. A few users noted that the technology hasn't evolved significantly but is often repackaged or renamed, which adds complexity and cost. They also argued that simpler systems could accomplish the same tasks more cost-effectively. The thread's sentiment was generally critical of the complexity and inefficiency of current data engineering systems.
58 comments
Posted in r/dataengineering by u/dataengineeringdude on 4/10/2024
335

And so it begins. Databricks just couldn't help themselves. Get your wallets out.

Discussion
The Reddit thread discusses Databricks' strategy, and the sentiment is largely negative. Users express frustration at perceived greed, with one user predicting further feature offloading and renaming of packages, possibly in preparation for a public offering. There is also skepticism about Databricks' claim of being open. Some users suggest alternatives such as using Postgres and Python scripts, or running open-source Delta and Spark for a cheaper platform. A few comments discuss the role of sales engineers in tech, with one user noting that they often lack technical expertise. Some users also express frustration that the supposedly impending IPO keeps being delayed. Overall, the thread highlights concerns about Databricks' future direction.
189 comments
Posted in r/MachineLearning by u/RobbinDeBank on 4/12/2024
233

[D] Publication rat race for PhD in ML

Discussion
The Reddit thread titled '[D] Publication rat race for PhD in ML' discusses the publication requirements for PhD programs in Machine Learning (ML) at top universities. Many users share their experiences with the high expectations, and some express frustration about the elitist nature of academia. Others offer advice on ways to meet these requirements, such as working as a research assistant during undergraduate studies, getting lucky with the professor you work with, or identifying clear goals beyond prestige and high-paying career paths. Some users also criticize the culture of prioritizing publications over good research. The overall sentiment is a mix of resignation about the competitive nature of the field and constructive advice for aspiring PhD candidates.
91 comments
Posted in r/MachineLearning by u/Accomplished_Rest_16 on 4/13/2024
203

[D] Multiple first-author papers in top ML conferences, but still struggling to get into a PhD program. What am I missing?

Discussion
The Reddit thread revolves around a user (Accomplished_Rest_16) expressing frustration about not being accepted into a PhD program despite having multiple first-author papers in top machine learning conferences and making significant contributions to the field. The top comments generally suggest that the user widen their range of applications beyond the most competitive schools, like Stanford and CMU, and improve their social and communication skills. Some commenters point out that a lower GPA and international status could be limiting factors. Others are skeptical of the user's claims, suggesting that the recommendation letters or statement of purpose could be weak, or that there might be a red flag in the application. The overall sentiment is mixed, with a blend of advice, empathy, and suspicion.
132 comments
Posted in r/dataengineering by u/Capable-Jicama2155 on 4/12/2024
104

Job Market Slightly Better, but Employers are SO PICKY and Hiring Process TOO LONG!!

Career
The Reddit thread titled 'Job Market Slightly Better, but Employers are SO PICKY and Hiring Process TOO LONG!!' drew a mixed response from participants. Most users expressed frustration over the evolving job market, citing higher application-to-response ratios, longer hiring processes, and falling salaries. The thread highlighted the challenges faced by data engineers, including a lack of new job postings, multiple hiring stages even at lesser-known companies, and a decrease in data literacy across industries. Some users suggested LinkedIn as a useful job-hunting tool, while others shared their employment struggles, which ranged from rescinded offers to extended waits for hiring decisions. Overall, the thread emphasized the increasing difficulties faced by job seekers in the current market.
54 comments