data subtldr week 34 year 2023
r/MachineLearning, r/dataengineering, r/SQL
Excel Embrace for SQL Pros, HuggingFace's Expansion Woes, Data Engineer's Transition to Big Tech, Databricks Certification Journey, Misleading SQL Cheat Sheet
Week 34, 2023
Posted in r/dataengineering by u/little_dog_lover • 8/21/2023
218
Me around Customer Success
Meme
The Reddit thread titled "Me around Customer Success" in the data engineering subreddit addresses the often overlooked importance of backend data teams in businesses. Top comments highlight the need for these teams to communicate their wins to avoid being seen as a cost center, and to adopt a 'data as a product' approach. Commenters also express frustration that IT is perceived as a cost while sales is seen as a profit generator, even though sales depends on IT and data infrastructure. They draw an analogy with a restaurant's kitchen staff, whose role is crucial yet undervalued, much like that of IT in a business. The overall sentiment leans towards advocating for a shift in business attitudes towards IT and data teams.
Posted in r/SQL by u/True_Sloth • 8/23/2023
199
Finally got a job as a data analyst, but I'll be using Excel 90% of the time instead of SQL which I am 10x better at.
Discussion
The Reddit thread from a new data analyst who will be using Excel instead of SQL, despite being more proficient in the latter, has generated supportive responses from the community. The top comments emphasize the importance and ubiquity of Excel, with one user sharing news about Python being introduced in Excel, potentially bridging the gap. Another user suggests exploring Power Query and Power Automate to leverage the analyst's existing SQL and Python skills within Excel. Overall, the sentiment is positive, encouraging the original poster to see this as an opportunity to broaden their skill set and become more versatile.
Posted in r/dataengineering by u/Background_Debate_94 • 8/25/2023
199
Just got certified! - Databricks Certified Data Engineer Professional
Career
The Reddit thread is about a user, Background_Debate_94, who successfully completed the Databricks Data Engineering Professional certification and shares the resources used. Comments mainly focus on the user's experience, preparation time, and relevance of the certification for his new job as a Data Engineer. Users ask about the user's real-life experience with Databricks and the estimated time it took to prepare for the exam. The sentiment in the thread is generally positive, with users congratulating the author and showing interest in his journey. The author clarifies that the certification might have played a role in landing his first Data Engineer job, where Databricks is primarily used.
Posted in r/MachineLearning by u/andi_cs1 • 8/25/2023
186
[D] Is it me or HuggingFace do TOO MANY things?
Discussion
The Reddit thread discusses the complexity and expansion of the HuggingFace ecosystem. Users express frustration about the platform's lack of direction, apparent bloat, and departure from its open-source roots. They criticize the company for branching out into too many libraries, which they say lowers quality and makes bug reporting difficult. There is also concern over tightly coupled code, particularly within the HuggingFace trainer, which makes changes challenging. However, some users acknowledge the platform's role in democratizing machine learning and expect improvements as more engineers join the community. One commenter highlighted the challenge of balancing open-source ideals with sustaining itself as an independent company.
Posted in r/dataengineering by u/Raydox328 • 8/22/2023
171
I am a 10 YOE (SSIS/low-code) DE preparing to transition into tier 1 tech companies. Here's my study plan in case it helps someone else.
Interview
The Reddit thread is about a data engineer (DE) with 10 years of experience sharing their study plan for transitioning into tier 1 tech companies. The plan covers topics like DS & Algorithms, System Design, Product Sense, Data Modeling, ML Concepts, and Cloud. Some top comments remarked on the intense preparation required for tech jobs compared to other white-collar jobs, with one user noting the high demands in law and medicine. Another user pointed out that the DE's earnings were already impressive for low-code work. The DE, in response to comments, clarified that the rigorous preparation was meant to catch up with industry standards and to transition into an individual contributor role. The overall sentiment was a mix of sympathy for the intense preparation required and respect for the DE's initiative and ambition.
Posted in r/MachineLearning by u/Wiskkey • 8/21/2023
152
[R] Beyond Surface Statistics: Scene Representations in a Latent Diffusion Model. Paper quote: "Using linear probes, we find evidence that the internal activations of the LDM [latent diffusion model] encode linear representations of both 3D depth data and a salient-object / background distinction."
Research
The Reddit thread discusses a research paper on the ability of latent diffusion models (LDMs) to internally represent 3D depth data and object-background distinctions. The top comments reveal mixed reactions. One commenter finds the result unsurprising, citing past evidence of GANs doing similar things. Another user finds the concept stunning and raises questions about a potential, elusive consciousness within AI. A third user argues that while the AI's ability to model 3D geometry is impressive, using the term 'understanding' might be misleading, as it implies intentionality that the AI lacks. Another comment dismisses the idea of AI consciousness as a massive leap from the research findings. Overall, users seem to be impressed and intrigued, but also cautious about over-attributing cognitive abilities to AI.
Posted in r/MachineLearning by u/AIsupercharged • 8/25/2023
149
Tech Giants Invest $235 Million in AI Startup Hugging Face [N]
News
The Reddit thread on AI startup Hugging Face's $235 million funding round from tech giants shows mixed sentiments. While recognizing the company's unique collaborative approach and its commitment to supporting developers, users express concern about the potential for monopolization in the AI space. They warn that despite its open-source ethos, Hugging Face could become restrictive and follow a similar path to OpenAI. There are calls for more competition and skepticism about how Hugging Face generates revenue. The thread underscores the ambivalence within the AI community about the consolidation of power and resources into a single platform, despite the potential benefits for developers.
Posted in r/SQL by u/onurbaltaci • 8/24/2023
45
I recorded a SQL Interview Exercise (10 Query Questions & Solutions) video and uploaded it on YouTube
MySQL
The Reddit thread discusses a SQL Interview Exercise video uploaded by user 'onurbaltaci'. It has received a positive response from the community with an upvote ratio of 0.96. Five comments were highlighted, all expressing gratitude towards the author for sharing the video. Users 'gordonwhims' and 'classicdannie' particularly appreciated the quality of the video, while 'InAmericaNumber1' thanked the author for creating it. The author responded to the comments, thanking the viewers for their engagement. Overall, the thread reflects a positive sentiment towards the SQL Interview Exercise video.
Posted in r/SQL by u/IdoubledareU31 • 8/23/2023
44
I did a beginner cheat sheet about JOINs
Discussion
The Reddit thread, titled 'I did a beginner cheat sheet about JOINs' by IdoubledareU31, features a user-created guide to SQL joins. The overall sentiment in the comments is critical, with the top comments pointing out inaccuracies and misleading representations. 'coffeewithalex' argues that JOIN statements are not set operations and that Venn diagrams are therefore inappropriate for visualizing them. 'Oxford89' advises against beginners creating such resources due to potential errors, while 'r3pr0b8' corrects the labels 'left (inner) join' and 'right (inner) join', noting that left and right joins are outer joins. 'Professional_Shoe392' suggests another source for visual representations of joins. The thread, though well-intended, is largely seen as misleading for beginners.
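The point about JOINs not being set operations can be illustrated with a minimal sketch. It is not taken from the thread: the `customers` and `orders` tables and their rows are hypothetical, and SQLite (via Python's standard `sqlite3` module) is assumed only for convenience. A Venn diagram suggests a LEFT JOIN returns "the left circle", yet here it returns more rows than either table contains, because each matching pair of rows yields its own output row.

```python
import sqlite3

# In-memory database with two small, hypothetical tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER, name TEXT);
    CREATE TABLE orders (customer_id INTEGER, item TEXT);
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO orders VALUES (1, 'book'), (1, 'pen'), (1, 'lamp');
""")

# LEFT JOIN is shorthand for LEFT OUTER JOIN: every customer is kept,
# and a customer with several matching orders appears once per order.
left_join_rows = conn.execute("""
    SELECT c.name, o.item
    FROM customers AS c
    LEFT JOIN orders AS o ON o.customer_id = c.id
    ORDER BY c.id, o.item
""").fetchall()

for row in left_join_rows:
    print(row)

# 2 customers and 3 orders produce 4 joined rows.
print(len(left_join_rows))
```

Ada appears three times (once per order) and Grace appears once with a NULL item, which a two-circle diagram cannot express.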