Brinda Gurusamy on LinkedIn: #llm #machinelearning #artificialintelligence #ai #ml (2025)

Brinda Gurusamy

Software Engineering, ML at Cisco | UC Berkeley


🗞 The Unreasonable Ineffectiveness of the Deeper Layers

This paper explores layer-pruning strategies for open-weight pretrained large language models. To prune these models, they use similarity across layers to identify the optimal block of layers to remove, followed by a healing step that involves a small amount of fine-tuning. Even after removing a significant portion of the model's layers, the models show minimal performance degradation on question-answering tasks. The study shows potential for reducing computing resources during fine-tuning and improving memory and latency constraints in inference. The robustness of LLMs to layer deletion may indicate inefficiencies in leveraging deeper layers, or the critical role of shallow layers in knowledge retention.

Three key takeaways:
⚫ Layer pruning significantly reduces model memory footprint and inference time in proportion to the number of removed layers, while maintaining robust performance.
⚫ Layer-pruning methods can complement other PEFT and quantization strategies to further reduce computational resources.
⚫ The model's resilience to deep-layer removal and its impact on downstream tasks emphasize the importance of shallow layers in retaining knowledge.

Interesting questions that the authors consider worthy of investigation:
⏺ What are optimal layer-pruning strategies and effective healing approaches?
⏺ How is knowledge distributed across layers, and how can LLMs utilize parameters in their deepest layers more effectively?

🔗 https://lnkd.in/gGSWhx85

#llm #machinelearning #artificialintelligence #ai #ml
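To make the similarity criterion above concrete, here is a minimal sketch (not the paper's released code) of choosing which block of layers to drop: it assumes per-layer hidden states have already been collected for a handful of prompts, and it picks the block whose input and output representations are closest in angular distance.

```python
# Minimal sketch of similarity-based block selection for layer pruning.
# Assumes per-layer hidden states were collected beforehand (e.g. with
# output_hidden_states=True in Hugging Face Transformers); the random
# tensors below only stand in for real activations.
import numpy as np

def angular_distance(h_a: np.ndarray, h_b: np.ndarray) -> float:
    """Mean angular distance between two layers' hidden states."""
    cos = np.sum(h_a * h_b, axis=-1) / (
        np.linalg.norm(h_a, axis=-1) * np.linalg.norm(h_b, axis=-1) + 1e-8
    )
    return float(np.mean(np.arccos(np.clip(cos, -1.0, 1.0)) / np.pi))

def best_block_to_prune(hidden_states: list, n: int) -> int:
    """Return the start index of the n-layer block whose input and output
    representations are most similar, i.e. the block that changes the
    representation the least and is therefore the pruning candidate."""
    num_layers = len(hidden_states) - 1  # entry 0 is the embedding output
    distances = [
        angular_distance(hidden_states[l], hidden_states[l + n])
        for l in range(num_layers - n + 1)
    ]
    return int(np.argmin(distances))

# Stand-in activations: embeddings + 32 layers, 16 prompts, hidden size 4096.
rng = np.random.default_rng(0)
states = [rng.standard_normal((16, 4096)) for _ in range(33)]
print("Prune the 8-layer block starting at layer", best_block_to_prune(states, n=8))
```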

Pete Grett

GEN AI Evangelist | #TechSherpa | #LiftOthersUp

6mo


Exciting insights on layer-pruning strategies for LLMs! Can't wait to see how this evolves the AI landscape. #continuouslearning Brinda Gurusamy


More Relevant Posts

  • HEMANTH LINGAMGUNTA

    "Pioneering Creativity and Innovation" I am a polymath—a lifelong learner with a deep and diverse understanding across multiple fields.


    Unlocking the Temporal Dimension in AI: Aristotle's Paradox Meets Modern Machine Learning

    Imagine AI models that don't just process time, but truly understand its enigmatic nature. By infusing Aristotle's temporal paradox into the DNA of LLMs, VLMs, and APIs, we're opening a portal to a new realm of artificial intelligence.

    Picture models that grasp the ephemeral present, the vanished past, and the potential future - not as mere data points, but as fluid concepts dancing on the edge of existence. This isn't just about improving time-stamped queries; it's about imbuing our digital creations with a philosophical understanding of time itself.

    From language models that craft narratives with a nuanced grasp of temporal flow, to vision systems that interpret the unfolding of events with unprecedented depth, we're not just advancing technology - we're teaching machines to ponder the very fabric of reality.

    This fusion of ancient wisdom and cutting-edge AI promises to revolutionize everything from predictive analytics to creative storytelling. Are we ready for AI that doesn't just process time, but contemplates its very nature?

    #AIPhilosophy #TemporalIntelligence #AristotleMeetsAI

    Citations:
    [1] Large Language Models Can Learn Temporal Reasoning https://lnkd.in/gDbmWE7r
    [2] Temporal Reasoning in LLM - Chenhan Yuan https://lnkd.in/gGwiKdBn
    [3] Temporal quality degradation in AI models - Scientific Reports https://lnkd.in/gGiAVn3a
    [4] Physics (Aristotle) - Wikipedia https://lnkd.in/ghVbiYnz
    [5] REST API Examples: TimeSeries Queries - Product Documentation https://lnkd.in/g6REfmxK



    📐 Unveiling "Skywork-Math": Pioneering New Frontiers in Mathematical Reasoning with AI 🔍

    We are delighted to present "Skywork-Math," a pioneering study demonstrating significant advancements in the mathematical reasoning capabilities of Large Language Models (LLMs). This research from Skywork AI at Kunlun Inc. reveals exciting findings that could transform how we develop and utilize AI in domains demanding complex reasoning.

    Why This Paper is a Must-Read:
    ➡️ Impressive Performance: Skywork-Math models achieve state-of-the-art results, surpassing early versions of GPT-4 on benchmarks like MATH and GSM8K, thanks to innovative supervised fine-tuning (SFT) strategies that leverage the 2.5M-instance Skywork-MathQA dataset.
    ➡️ Advanced Data Techniques: The two-stage data synthesis and SFT pipeline, involving multiple augmentation methods, allows these models to tackle a broad spectrum of mathematical problems efficiently.

    Key Insights:
    ➡️ Scaling with Quality and Quantity: Findings suggest that LLMs' mathematical abilities improve significantly with the scale of data, debunking the belief that large amounts of synthetic data do not substantially enhance reasoning capabilities.
    ➡️ Practical Takeaways for AI Development: This paper provides actionable strategies to enhance the precision of mathematical reasoning in AI, making it a valuable resource for both academia and industry.

    🔥 Explore more cutting-edge strategies and network with top industry leaders at the DataHack Summit 2024. Join us in defining the new world order in Generative AI this August in Bengaluru: https://lnkd.in/gAsFp6w7

    #AnalyticsVidhya #GenerativeAI #ResearchPaper


  • Jesse H.

    GenAI | ML/AI Engineering | Multi-Agent Systems | Knowledge Graphs


    Graphs are so powerful for RAG systems. They allow for knowledge-based entity relationships rather than bulk patterns in the language. Using them in conjunction with smart chunking and retrieval methods can make unbelievable differences in the way Gen AI can answer questions.
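As a rough illustration of the point above, the sketch below pairs a toy entity-relationship graph with chunk lookups; the graph contents, chunk ids, and function names are invented for the example, and entity extraction plus dense retrieval are assumed to live elsewhere in the pipeline.

```python
# Toy sketch: graph-augmented retrieval for RAG. Entities are nodes, relations
# are edges, and each node remembers which text chunks it was extracted from,
# so retrieval can follow relationships instead of raw text overlap.
import networkx as nx

kg = nx.Graph()
kg.add_edge("layer pruning", "QLoRA", relation="healed_with")
kg.add_edge("layer pruning", "inference latency", relation="reduces")
kg.nodes["layer pruning"]["chunks"] = ["chunk_12", "chunk_31"]
kg.nodes["QLoRA"]["chunks"] = ["chunk_07"]
kg.nodes["inference latency"]["chunks"] = ["chunk_18"]

def graph_augmented_context(question_entities):
    """Collect chunks attached to the question's entities and their 1-hop
    neighbours; these would be merged with ordinary dense-retrieval hits."""
    chunk_ids = []
    for entity in question_entities:
        if entity not in kg:
            continue
        for node in [entity, *kg.neighbors(entity)]:
            chunk_ids.extend(kg.nodes[node].get("chunks", []))
    return sorted(set(chunk_ids))

# The returned ids would be mapped back to chunk text and placed in the prompt.
print(graph_augmented_context(["layer pruning"]))
```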


  • Elvis S.

    Co-founder at DAIR.AI | PhD | Prev: Meta AI, Galactica LLM, PapersWithCode, Elastic | Creator of the Prompting Guide (5M+ learners)


    Cool paper proposing a graph-based agent system to enhance the long-context abilities of LLMs. It first structures long text into a graph (elements and facts) and employs an agent to explore the graph using predefined functions guided by a step-by-step rational plan. The agent accesses coarse graph components and detailed text, takes notes, and reflects until enough information has been gathered to generate an answer.

    This approach helps to effectively and reliably generate answers to questions. It claims to consistently outperform GPT-4-128k across context lengths from 16k to 256k.

    You can never sleep on the power of graph or tree structures, which in this case help to capture long-range dependencies and multi-hop relationships within long text. Similar to other tree structures I have reported on in past paper tweets, the agents now get to leverage enriched information to solve tasks.

    https://lnkd.in/ecZVaun3

    ↓ For more, follow my weekly summary of the top AI and LLM papers. Read by 65K+ AI researchers and developers: https://lnkd.in/e6ajg945
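A hedged sketch of what such an exploration loop might look like: it assumes the long text has already been structured into a graph of summarized nodes, and `call_llm` stands in for whatever model backend drives the agent; the function set and prompts are illustrative, not the paper's exact interface.

```python
# Sketch of a graph-exploring agent for long-context QA: it reads the detailed
# text behind a coarse node, takes notes, reflects on whether the notes suffice,
# and otherwise hops to a neighbouring node.
from dataclasses import dataclass, field

@dataclass
class GraphExplorer:
    graph: dict            # node_id -> {"text": str, "neighbors": [node_ids]}
    notes: list = field(default_factory=list)

    def read_node(self, node_id):
        """Predefined function: fetch the detailed text behind a coarse node."""
        return self.graph[node_id]["text"]

    def neighbors(self, node_id):
        """Predefined function: expose 1-hop structure for multi-hop questions."""
        return self.graph[node_id]["neighbors"]

    def answer(self, question, start, call_llm, max_steps=8):
        node = start
        for _ in range(max_steps):
            # Take notes on the current node's detailed text.
            self.notes.append(call_llm(
                f"Extract facts relevant to '{question}' from:\n{self.read_node(node)}"))
            # Reflect: stop, or pick a neighbouring node to explore next.
            decision = call_llm(
                f"Question: {question}\nNotes: {self.notes}\n"
                f"Reply DONE if the notes suffice, otherwise pick one of {self.neighbors(node)}.")
            if decision.strip() == "DONE" or decision not in self.graph:
                break
            node = decision
        return call_llm(f"Answer '{question}' using only these notes: {self.notes}")

# `call_llm` is any text-in/text-out function, e.g. a thin wrapper around an LLM API.
```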


  • Rajesh B

    QA specialist | Wi-Fi | EasyMesh | Audio Video Streaming | STB| IPTV | Deep Learning | NLP | AI / ML


    Just finished the course "AI Text Summarization with Hugging Face" by Janani Ravi! Check it out: https://lnkd.in/g5WqJy8S #automatictextsummarization #artificialintelligence

    Today's learning: AI Text Summarization with Hugging Face

    🚀 Just completed the "AI Text Summarization with Hugging Face" course! 📚 Here's a snapshot of the topics covered:

    📝 Extractive Summarization:
    Dive into extractive text summarization and explore intermediate representations.
    Learn about evaluation metrics and how to select transcript lines.

    💻 Hugging Face Exploration:
    Sign up for Hugging Face and discover the power of the sumy library for extractive summarization.
    Get hands-on with both extractive and abstractive summarization using Hugging Face.

    🤖 Transformers in Action:
    Gain insights into transformers, attention mechanisms, and sequence-to-sequence models.
    Work with Hugging Face Transformers using Colab and evaluate summaries with ROUGE scores.

    🔧 Fine-Tuning Models:
    Understand tokenizers, fine-tune the T5-small model, and push it to the Hugging Face Hub.
    Summarize text using your fine-tuned model for personalized results.

    🌐 Exploring Different Transformers:
    Access datasets like BBC News summaries and generate diverse summaries using models like Pegasus and BART.
    Compute aggregate ROUGE scores to evaluate the quality of generated summaries.

    Excited to leverage these skills in AI text summarization! 🚀 #HuggingFace #TextSummarization #AI
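For anyone who wants to try the abstractive half of this workflow, here is a small sketch using the Hugging Face transformers pipeline and the evaluate library for ROUGE scoring; the checkpoint and sample text are illustrative choices, not the course's exact materials.

```python
# Abstractive summarization with a Hugging Face pipeline, scored with ROUGE.
from transformers import pipeline
import evaluate

summarizer = pipeline("summarization", model="sshleifer/distilbart-cnn-12-6")

article = (
    "Layer-pruned language models retain most of their question-answering "
    "accuracy after a small healing fine-tune, which reduces memory use and "
    "inference latency roughly in proportion to the layers removed."
)
reference = "Pruned LLMs keep QA accuracy after light fine-tuning, cutting memory and latency."

summary = summarizer(article, max_length=40, min_length=10, do_sample=False)[0]["summary_text"]

rouge = evaluate.load("rouge")
scores = rouge.compute(predictions=[summary], references=[reference])
print(summary)
print(scores)  # rouge1 / rouge2 / rougeL F-measures
```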

    Certificate of Completion linkedin.com


  • Michael(Mike) Erlihson

    Head of AI @ Stealth | PhD in Math | Scientific Content Creator & Lecturer | Podcast Host | Deep Learning & Data Science Expert | 250+ Deep Learning Paper Reviews | 25+ recorded DL podcasts | 52K+ followers |


    🚀 📚 𝐓𝐡𝐞 𝐔𝐧𝐫𝐞𝐚𝐬𝐨𝐧𝐚𝐛𝐥𝐞 𝐈𝐧𝐞𝐟𝐟𝐞𝐜𝐭𝐢𝐯𝐞𝐧𝐞𝐬𝐬 𝐨𝐟 𝐭𝐡𝐞 𝐃𝐞𝐞𝐩𝐞𝐫 𝐋𝐚𝐲𝐞𝐫𝐬 🔥 🔎

    ✅ Just finished an enlightening read: "The Unreasonable Ineffectiveness of the Deeper Layers," a study challenging prevailing assumptions about the utility of deeper layers in large language models (LLMs).

    ✅ The paper, authored by a team from Meta FAIR, Cisco, Zyphra, MIT, and Sequoia Capital, presents a novel layer-pruning strategy, revealing that many LLMs maintain robust performance even when up to half of their layers are removed.

    ✅ This research pivots on a simple yet effective method to identify and prune redundant layers without significantly compromising model performance. Using parameter-efficient fine-tuning (PEFT) techniques, including quantization and Low-Rank Adapters (QLoRA), the study meticulously demonstrates that significant computational resources can be conserved during both the fine-tuning and inference phases.

    ✅ A particularly striking takeaway is the discovery that not all layers contribute equally to the model's capabilities. This finding beckons a reconsideration of how these AI models are structured and optimized. Furthermore, it suggests that current pretraining techniques might be underutilizing the potential of deeper network layers, or that shallow layers play a more pivotal role than previously understood.

    ✅ From a practical standpoint, the implications for AI deployment are profound, especially for applications requiring efficiency and speed without sacrificing output quality. For the scientific community, these results offer an interesting perspective from which to examine the architectural efficacy of neural networks.

    ✅ For anyone involved in AI development or interested in the sustainable scaling of AI technologies, this paper is well worth a read. It not only challenges established paradigms but also opens up new avenues for making AI more accessible and efficient.

    #ai #machinelearning #deeplearning #artificialintelligence
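To make the healing step more tangible, below is a hedged sketch of deleting a contiguous block of decoder layers and then attaching QLoRA adapters for a brief parameter-efficient fine-tune; the base model name, layer indices, and LoRA hyperparameters are placeholders rather than the paper's actual settings.

```python
# Sketch: prune a block of decoder layers, then "heal" with QLoRA
# (4-bit base weights + Low-Rank Adapters). Values below are illustrative.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",          # placeholder base model
    quantization_config=bnb_config,
    device_map="auto",
)

# Drop a contiguous block of decoder layers (indices chosen by the similarity
# criterion in practice; hard-coded here for illustration).
kept = torch.nn.ModuleList(
    layer for i, layer in enumerate(model.model.layers) if not (20 <= i < 28)
)
model.model.layers = kept
model.config.num_hidden_layers = len(kept)

# Attach LoRA adapters and fine-tune briefly on a small amount of data.
lora_config = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
# ...followed by a short supervised fine-tuning run (e.g. with the HF Trainer).
```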


  • Piotr Macai

    AI | Web | Automation Builder: macai.studio Help You Implement AI For Your Business Gen AI Tools Directory & Magazine: ainsider.tools


    Ainsider Ai Newsletter vol.36 is live ⚡️

    Inside:
    ✔️ OpenAI released the 'o1' models
    ✔️ Adobe released its own Video Model
    ✔️ Notebook LM from Google is insane for research and learning
    ✔️ Latest AI Tools added to the Library

    Explore: https://lnkd.in/dhA_NrGH

    #ai #technology #artificialintelligence

    Ainsider Ai Newsletter vol.36 ainsider.beehiiv.com


  • Lewis Walker ➲


    𝗖𝘂𝘁 𝗟𝗟𝗠 𝗰𝗼𝗺𝗽𝘂𝘁𝗲 𝗯𝘆 𝟱𝟬% 𝘄𝗶𝘁𝗵 𝗠𝗶𝘅𝘁𝘂𝗿𝗲-𝗼𝗳-𝗗𝗲𝗽𝘁𝗵𝘀 💫

    ↓ Highlights:
    ➲ Transformer-based language models tend to evenly distribute computation across input sequences
    ➲ With Mixture-of-Depths (MoD), additional computation is dynamically allocated to specific, more complex segments of sequences
    ➲ MoD reduces computation by 50% during post-training sampling
    ➲ It maintains accuracy comparable to baseline models
    ➲ MoD accelerates processing speed by up to 50% in specific tasks

    Link to the Google DeepMind paper in the comments.

    '𝗩𝗶𝗲𝘄 𝗺𝘆 𝗯𝗹𝗼𝗴' ↑ for more Generative AI insights.

    #generativeai #artificialintelligence #deepmind
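For intuition, here is a minimal, assumption-laden sketch of a Mixture-of-Depths-style block: a learned router scores tokens, only a top-k subset gets the block's full computation, and the remaining tokens pass through unchanged on the residual stream. This illustrates the routing idea only; it is not DeepMind's implementation.

```python
# Toy Mixture-of-Depths block: route only the top-k tokens through the
# expensive computation, let the rest skip via the residual connection.
import torch
import torch.nn as nn

class MoDBlock(nn.Module):
    def __init__(self, d_model, capacity=0.5):
        super().__init__()
        self.router = nn.Linear(d_model, 1)   # per-token routing score
        self.block = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.capacity = capacity               # fraction of tokens processed

    def forward(self, x):
        batch, seq_len, _ = x.shape
        k = max(1, int(seq_len * self.capacity))
        scores = self.router(x).squeeze(-1)    # (batch, seq_len)
        top = scores.topk(k, dim=-1).indices   # tokens that get full compute
        out = x.clone()
        for b in range(batch):                 # simple loop for clarity
            selected = x[b, top[b]].unsqueeze(0)
            processed = self.block(selected).squeeze(0)
            # Scale the update by the (sigmoid) router score so routing stays
            # differentiable through the selected tokens.
            out[b, top[b]] = x[b, top[b]] + scores[b, top[b], None].sigmoid() * (
                processed - x[b, top[b]]
            )
        return out

tokens = torch.randn(2, 16, 64)
print(MoDBlock(d_model=64)(tokens).shape)      # torch.Size([2, 16, 64])
```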


  • Matthew Sinclair

    Partner and Vice President, Engineering


    Here's everything that I found interesting in April 2024 about machines behaving intelligently:

    #qfm #quantumfaxmachine #blog #machine #intelligence #ai #genai

    QFM013: Machine Intelligence Reading List April 2024
    https://lnkd.in/ezQQxd9m

    QFM013: Machine Intelligence Reading List April 2024 quantumfaxmachine.com


  • Aionlinecourse.com



    10 standout AI & ML research papers of 2023!! 🤖

    As we bid farewell to 2023, here are 10 standout research papers that shaped the landscape. From large language models to cutting-edge computer vision, these papers showcase innovation and impact. What's your favorite AI/ML moment this year?

    ✴️ Pythia: Analyzing Large Language Models https://lnkd.in/gkFJzMth
    ✴️ Llama 2: Open Foundation and Chat Models https://lnkd.in/eedvCPu4
    ✴️ QLoRA: Efficient Finetuning of Quantized LLMs https://lnkd.in/e_W-k6UY
    ✴️ BloombergGPT: A Finance-Focused LLM https://lnkd.in/gGDmQtfM
    ✴️ Direct Preference Optimization: Streamlining Finetuning https://lnkd.in/gfbGCZMM
    ✴️ Mistral 7B: Compact Powerhouse LLM https://lnkd.in/djykJDMp
    ✴️ Orca 2: Teaching Small LLMs How to Reason https://lnkd.in/e33vH9-x
    ✴️ ConvNets vs. Vision Transformers https://lnkd.in/e5NQ9GGb
    ✴️ Segment Anything: Meta's Image Segmentation https://lnkd.in/e42BZvYJ
    ✴️ Emu Video: Text-to-Video Synthesis https://lnkd.in/gNUGeUYs

    Here's to a year of incredible advancements! 🎉

    #AI #ML #ResearchHighlights #Innovation #TechTrends #YearInReview

