data subtldr week 41 year 2024
r/MachineLearning • r/dataengineering • r/sql
Debates on Certification's Value in Data Engineering, Writing SQL Queries in Lower Case, Machine Learning Researchers Winning Nobel Prize for Physics, and AlphaFold's Recognition in Chemistry Nobel Prize
•Week 41, 2024
Posted in r/MachineLearning by u/PrittEnergizer • 10/8/2024
1123
[N] 2024 Nobel Prize for Physics goes to ML and DNN researchers J. Hopfield and G. Hinton
News
The 2024 Nobel Prize for Physics was awarded to Machine Learning (ML) and Deep Neural Network (DNN) researchers J. Hopfield and G. Hinton, sparking a mixed response from the Reddit community. While some celebrated the recognition of ML and DNN contributions, others expressed disappointment, calling the award a controversial encroachment into the realm of physics. Some called for protests and worried about the precedent it sets. Some users suggested that the decision might overshadow the contributions of other significant figures in ML/DNN. A few comments even humorously suggested that the next Chemistry Nobel Prize might go to DeepMind's protein-folding work. Overall, the sentiment leaned towards dissatisfaction with the decision because of a perceived blurring of disciplinary boundaries.
Posted in r/dataengineering by u/ephemeral404 • 10/7/2024
749
Teeny tiny update only
Meme
The Reddit thread titled 'Teeny tiny update only' in r/dataengineering attracted a number of humorous responses from the community. Suggestions ranged from not having a data schema at all to making 10 PB of NFS disk available and deleting all data monthly. One user pointed out the challenge of an upstream system changing the main entity's keys from synthetic to natural. Another noted the added difficulty of making such changes on a Friday afternoon or even a Sunday night. A tool called SQLMesh was suggested for planning changes and inferring breaking changes automatically. The overall sentiment is light-hearted, with a hint of sarcasm and shared frustration.
Posted in r/MachineLearning by u/aagg6 • 10/9/2024
411
[N] The 2024 Nobel Prize in Chemistry goes to the people Google Deepmind's AlphaFold. One half to David Baker and the other half jointly to Demis Hassabis and John M. Jumper.
News
The 2024 Nobel Prize in Chemistry was split in two: one half went to David Baker for computational protein design, and the other half jointly to Demis Hassabis and John M. Jumper of Google DeepMind for protein structure prediction with AlphaFold. The Reddit thread mostly agrees on the significance of AlphaFold, with many users stating they foresaw the project's recognition. However, there is a notable controversy regarding the inclusion of Hassabis, Google DeepMind's CEO. Skeptics question his direct involvement in the project, likening his Nobel recognition to a corporate executive winning the Nobel in Literature for a tool developed by their company. Others highlight Hassabis' technical background and suggest the debate reflects broader issues with attribution in scientific awards.
Posted in r/dataengineering by u/OpenWeb5282 • 10/13/2024
370
Good book for technical and domain-specific challenges for building reliable and scalable financial data infrastructures. I had read couple of chapter.
Discussion
The Reddit thread revolves around a book recommendation for building reliable and scalable financial data infrastructures. Top comments suggest that the book is particularly useful for individuals in banking, investment firms, financial data providers, and similar roles. There's speculation that the original poster (OP) may be the book's author, but regardless, the OP's history shows consistent book recommendations. Some users inquire about the book's applicability to other sectors like trading and accountancy, and even outside the financial sphere, such as in engineering. The OP provides a comprehensive index of the book, emphasizing its depth and coverage of financial data governance, audit, and retention. Overall, the sentiment is positive, with a sense of curiosity about the book's potential cross-domain applicability.
Posted in r/MachineLearning by u/optimization_ml • 10/9/2024
340
[N] Jurgen Schmidhuber on 2024 Physics Nobel Prize
News
The Reddit thread on Jurgen Schmidhuber's critique of the 2024 Physics Nobel Prize reveals mixed sentiments. Schmidhuber argues that the prize rewarded plagiarism and incorrect attribution, pointing specifically to the lack of acknowledgment of Shun-ichi Amari's earlier work in connection with the 'Hopfield network' and the 'Boltzmann Machine'. Some commenters support Schmidhuber, agreeing that plagiarism is a serious issue and sympathizing with his frustration. Others criticize him, arguing that his research, while valuable, lacked practical applications compared to Hinton's. A few commenters also suggest that Schmidhuber's aggressive approach to these issues may have contributed to his being undervalued in the field. Overall, the thread reflects a lively debate on academic integrity and the politics of recognition in machine learning.
Posted in r/MachineLearning by u/haoyuan8 • 10/13/2024
339
[P] Drowning in Research Papers? 🐸
Project
The Reddit thread discusses Ribbit Ribbit, an AI research paper discovery tool built by two engineers. The tool curates personalized paper recommendations and summarizes them into tweet-sized content. The top comments include suggestions for improvements, such as adding more ways to sort papers, including by citation and conference, and being able to filter by sub-field. The creators clarified that users can customize their interests and assured that the model-generated paper understanding is reliable. Other users expressed interest in a feature to add papers to reference management software like Zotero. Overall, the sentiment was positive, with users appreciating the tool's novel approach to handling research paper consumption.
Posted in r/dataengineering by u/MasterKluch • 10/11/2024
169
Hot Take: Certifications are a money grab and often overrated (preface - I took and failed the dbt analytics twice)
Discussion
The thread debates the value of certifications in the data engineering field. The original post by MasterKluch critiques certifications as overrated and a money grab, citing the poster's own experience of failing the dbt analytics exam twice. Many contributors agree, drawing on their own experiences. They assert that practical experience holds more weight than certifications, which they see as time-consuming and often irrelevant to real-world work. Some users mentioned that certifications could be useful early in a career or for specific company-related benefits. Several commenters also criticized the dbt and AWS certifications for their focus on obscure details and memorization over practical knowledge. The overall sentiment is skepticism about the true value of certifications.
Posted in r/dataengineering by u/thatsagoodthought • 10/9/2024
112
Am I just doing it wrong?
Discussion
In a Reddit thread, a data engineer expressed concerns about being pressured to quickly deliver data extracts to senior management without adequate time for quality checks. The top comments suggested that rushing such tasks could lead to errors and advised the engineer to push back against unrealistic expectations. The commenters also highlighted the importance of quality assurance in data extraction, even in scenarios where speed is vital. They suggested that the engineer's role should focus on engineering and not managing end-user relationships, which could be handled by data analysts or report developers. The overall sentiment favored maintaining quality and accuracy over speed in data engineering tasks.