ETRI Knowledge Sharing Platform : Kwon Yongin

Kwon Yongin Senior Researcher

Department: On-Device System Software Research Section

KSP Keywords: Deep Learning Models Deep Neural Network(DNN) Efficient Code Neural Processing Compiler Optimization Auto-Tuning Language Models Multi-Level Artificial Intelligence Mobile Devices Convolution Neural Network(CNN) Model Parameter Speed-Up Real-Time Object Detection Preference Inference Image Classification Optimal Kernel Quantization Method Auto-Tune Efficient Model Speech Recognition Graphic Processing Unit(GPU) Sequence-Based User's Preference Target Accuracy Computational Requirements Computational Efficiency Image Sequences RISC-V Code Generation Execution Speed Model Compression Object Manipulation Resource-Constrained Energy Efficiency AI Applications Resource Constraints State-Of-The-Art Execution Time Double Buffering Visual Preference Neuromorphic Computing Tuning Process Structural Information Intermediate Representation Tuning Time Transformer-Based Accuracy Requirements GPU Memory Usage Multiple GPUs Mean Square Error(MSE) Parallel Codes Data Transfer Machine Code Robotic Guide Sensitivity Analysis Memory Consumption Model Portability Fine-Tuning Hardware And Software Platform Visual Attributes Local Metrics LLVM Compiler Time Consumption Genetic Algorithms(NSGA II) K-Means Gradient Boosting Learning Framework Dynamic Change Parallel Computing Comprehensive Evaluation Hardware Accelerator Burst Data Search Time Task Difficulty Flexible Accelerator Accuracy Loss Hybrid Vision Scale Model Small Footprint Design Practices Multiple Targets Heuristic Search Memory Cost Cost-Effective Memory Access Model Deployment DL Model Arm Cortex Fast Deployment Numerical Stability Processing Performance Robotic Object Hybrid Transformer Performance Limits Exponential Approximation Linear Complexity Latency Constraints Tuning Method Resource Utilization Embedded AI High Performance Parallel Processing Detection Systems(IDS) Bit Representation High Efficiency Optimal Method Best Kernel Floating-Point Operations Real-Time Processing Burst Time Asymmetric Multicore Processors Highly Dynamic Scheduling Technique Mobile Platform Optimal Number Edge Devices Mixed Precision Dynamic Graph Higher Efficiency Processing-In-Memory Performance Optimization Performance Bottleneck Workload Distribution Average Error Rate On-Chip Memory Path Planning Search Space Accelerator Design Heterogeneous Computing Systems Large-Scale Off-Chip Quantization Error Constrained Systems Human Intention Memory Bandwidth Neural Processor Visual Environment Cost Model Non-Linear Quantization Computational Complexity Processing Speed Visual Reasoning Complementary Methods Design Space Exploration Optimization Techniques Stable Performance Language Interaction Machine Learning Models Scheduling Plans Energy Overhead Chip Area Mathematical Reasoning Trade-Off De-Quantization High Throughput Low Latency Multi-GPU Model Performance Process Data Specialized Hardware Embedded System Graph Partitioning Heterogeneous Processing Model-Based Real-World Deployment Different Operating Modes Truss Robot Centralized Processing IoT Systems

Articles (50)
Patents (1)
Technology Transfer (7)
R&D Reports (4)
Standards (0)
Monographs/Talk (0)

Article search results
Type	Year	Title	Cited
Journal	2026	Towards an efficient dataflow-flexible accelerator by finding optimal dataflows of DNNs Hyunjun Kim, Whoi Ree Ha Future Generation Computer Systems, v.176, pp.1-10	0
Conference	2025	MLIR-ARX: Accelerator-Aware MLIR-to-RISC-V Compilation Integrated with an EDA Flow Yongin Kwon Conference on Neural Information Processing Systems (NeurIPS) 2025 : Workshop, pp.1-12
Conference	2025	A NUMA Aware Compiler Framework for Large Scale Mathematical Reasoning Inference on PCIe Based Multi Accelerator Systems JooHyoung Cha Conference on Neural Information Processing Systems (NeurIPS) 2025 : Workshop, pp.1-5
Journal	2025	Target-Aware Neural Network Execution via Compiler-Guided Pruning JooHyoung Cha, Taeho Kim IEEE Transactions on Mobile Computing, v.TBD (volume and issue not yet assigned), pp.1-14	0
Conference	2025	Resource-Efficient On-Device Face Recognition Using K-Means Clustering 조현준 Korea Information Processing Society Conference (Fall) 2025, pp.131-132
Conference	2025	Design Practices and Lessons from Deploying On-device Vision-Language Interaction in Robotic Guide Dogs Jinse Kwon International Conference on Computer Vision Workshops (ICCVW) 2025, pp.2435-2444
Conference	2025	TriPlanNet: Triangle Path Planning Network for A Variable Truss Robot with Deep Learning Choonghan Lee International Conference on Computer Vision Workshops (ICCVW) 2025, pp.2476-2485
Conference	2025	Luthier: Bridging Auto-Tuning and Vendor Libraries for Efficient Deep Learning Inference Yongin Kwon International Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES) 2025, pp.1-23	0
Conference	2025	I-FlashAttention: Fully Integer Fused Attention for Efficient Vision Transformers Sehyeon Oh International Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES) 2025, pp.25-26	0
Conference	2025	Exploring the Trade-Offs: Quantization Methods, Task Difficulty, and Model Size in Large Language Models From Edge to Giant Jemin Lee International Joint Conference on Artificial Intelligence (IJCAI) 2025, pp.8113-8121	0
Conference	2025	Multi-level Machine Learning-Guided Autotuning for Efficient Code Generation on a Deep Learning Accelerator JooHyoung Cha ACM SIGPLAN/SIGBED Conference on Languages, Compilers and Tools for Embedded Systems (LCTES) 2025, pp.134-145
Journal	2025	Data Transmission Optimization for NPU System Scalability 오세현 IEMEK Journal of Embedded Systems and Applications, v.20, no.3, pp.125-130
Conference	2025	A Study on Enhancing Edge Device Performance through AI Computation Comparison between Raspberry Pi 5 and Hailo-8 and 8L 양병찬 Korea Information Processing Society Conference (Spring) 2025, pp.106-107
Journal	2025	QuantuneV2: Compiler-based local metric-driven mixed precision quantization for practical embedded AI applications Jeongseok Kim, Jemin Lee Future Generation Computer Systems, v.166, pp.1-15	3
Conference	2025	Development of an On-Device AI-Based Autonomous Surveillance Robot 조현준 Korea Information Processing Society Conference (Spring) 2025, pp.340-341
Conference	2025	Dynamic Layer-Specific Overlapping for Efficient LLM Inference on Resource-Constrained Systems Misun Yu International Symposium on Code Generation and Optimization (CGO) 2025, pp.1-3
Journal	2025	Optimizing Real-Time Object Detection in a Multi-Neural Processing Unit System Sehyeon Oh Sensors, v.25, no.5, pp.1-12	2
Conference	2025	A Lightweight Deep Learning Backend for Edge Devices Optimized for Limited C Library Environments Jeman Park International Symposium on Code Generation and Optimization (CGO) 2025, pp.1-3
Conference	2024	ML2Tuner: Efficient Code Tuning via Multi-Level Machine Learning Models JooHyoung Cha Conference on Neural Information Processing Systems (NeurIPS) 2024 : Workshop, pp.1-12
Conference	2024	Optimizing Real-Time Object Detection in a Multi NPU System with Double Buffering and Queue-Based Processing Sehyeon Oh International Conference on Artificial Intelligence Computing and Systems (AIComps) 2024, pp.1-5
Journal	2024	Q-HyViT: Post-Training Quantization of Hybrid Vision Transformers with Bridge Block Reconstruction for IoT Systems Jemin Lee IEEE Internet of Things Journal, v.11, no.22, pp.36384-36396	5
Conference	2024	PCIe-Based Multi-NPU Data Transmission Optimization 오세현 Institute of Embedded Engineering of Korea (IEMEK) Conference (Fall) 2024, pp.1-3
Conference	2024	Evaluating DNN Throughput via Model Partitioning on Heterogeneous Systems 유미선 Institute of Embedded Engineering of Korea (IEMEK) Conference (Fall) 2024, pp.20-22
Conference	2024	Utilization of Virtual Containers via Privilege Restriction in Embedded Systems 차주형 Institute of Embedded Engineering of Korea (IEMEK) Conference (Fall) 2024, pp.278-280
Conference	2024	Visual Preference Inference: An Image Sequence-Based Preference Reasoning in Tabletop Object Manipulation Joonhyung Lee International Conference on Intelligent Robots and Systems (IROS) 2024, pp.1-8	0
Journal	2024	NEST-C: A deep learning compiler framework for heterogeneous computing systems with artificial intelligence accelerators Jeman Park ETRI Journal, v.46, no.5, pp.851-864	5
Conference	2024	Mixed Non-linear Quantization for Vision Transformers Gihwan Kim, Jemin Lee European Conference on Computer Vision (ECCV) 2024, pp.1-16
Conference	2024	LLMem: Estimating GPU Memory Usage for Fine-Tuning Pre-Trained LLMs Taeho Kim International Joint Conference on Artificial Intelligence (IJCAI) 2024, pp.6324-6332	3
Conference	2024	Visual Preference Inference: An Image Sequence-Based Preference Reasoning in Tabletop Object Manipulation Joonhyung Lee International Conference on Robotics and Automation (ICRA) 2024 : Workshop, pp.1-5
Journal	2024	Performance Analysis of Deep Learning Accelerator for Edge Inference 박시형 Journal of the Institute of Electronics and Information Engineers, v.61, no.1, pp.23-26
Conference	2023	ACLTuner: A Profiling-Driven Fast Tuning to Optimized Deep Learning Inference Yongin Kwon Conference on Neural Information Processing Systems (NeurIPS) 2023 : Workshop, pp.1-12
Journal	2023	Design and Verification of a Common Interface for Proprietary NPU Code Generation in General-Purpose AI Compilers 이제민 Journal of the Institute of Electronics and Information Engineers, v.60, no.10, pp.29-32
Journal	2023	Profile-based Optimization for Deep Learning on Heterogeneous Multi-core CPUs 차주형 Journal of the Institute of Electronics and Information Engineers, v.60, no.7, pp.40-49
Journal	2023	Tensor slicing and optimization for multicore NPUs Rafael Sousa Journal of Parallel and Distributed Computing, v.175, pp.66-79	9
Journal	2023	PartitionTuner: An operator scheduler for deep‐learning compilers supporting multiple heterogeneous processing units Misun Yu ETRI Journal, v.45, no.2, pp.318-328	4
Conference	2023	A Study on the Performance Improvement of Korean Math Word Problem Solving Using Labeled-Edge Information 여상엽 Korean Institute of Communications and Information Sciences (KICS) Conference (Winter) 2023, pp.1063-1064
Conference	2023	A Study on Scheduler for High Throughput in Heterogeneous Computing and Multiple Deep Learning Models 차주형 Korean Institute of Communications and Information Sciences (KICS) Conference (Winter) 2023, pp.1071-1072
Conference	2022	Profiling-based ArmCL Optimal Schedule Search for Single-ISA Heterogeneous Multi-Core Architectures 차주형 Institute of Electronics and Information Engineers (IEIE) Conference (Fall) 2022, pp.300-304
Conference	2022	CPrune: Compiler-Informed Model Pruning for Efficient Target-Aware DNN Execution Taeho Kim European Conference on Computer Vision (ECCV) 2022 (LNCS 13680), pp.651-667	8
Journal	2022	Quantune: Post-training Quantization of Convolutional Neural Networks using Extreme Gradient Boosting for Fast Deployment Jemin Lee Future Generation Computer Systems, v.132, pp.124-135	19
Conference	2021	Performance Improvement of Neural-net Computation using Branch-Parallel Execution on Heterogeneous Processing Units 유미선 Institute of Embedded Engineering of Korea (IEMEK) Conference (Fall) 2021, pp.257-260
Conference	2021	Mixed-Precision Quantization via Glow Compiler Extension 이제민 Institute of Embedded Engineering of Korea (IEMEK) Conference (Fall) 2021, pp.205-207
Conference	2021	Development of Scalable HLS based Deep Learning Accelerator 권용인 Institute of Embedded Engineering of Korea (IEMEK) Conference (Fall) 2021, pp.247-249
Conference	2020	Accuracy Improvement of Quantized Neural Network Model Based on Operator Fusion for NPU 이제민 Institute of Embedded Engineering of Korea (IEMEK) Conference (Fall) 2020, pp.1-4
Conference	2020	An Automatic C/C++ code Generation Framework for Deep Neural Network Deployment on Embedded Devices 유미선 Korea Computer Congress (KCC) 2020, pp.691-693
Conference	2020	Accelerating the Inference of CNN using Target-Independent Operator Fusion based on the Glow Compiler 이제민 IEMEK Symposium on Embedded Technology (ISET) 2020, pp.64-66
Conference	2020	Profiling-based Graph Partitioning System for Multi-accelerator Deep Learning Compilers 유미선 IEMEK Symposium on Embedded Technology (ISET) 2020, pp.67-70
Conference	2020	Accelerating Object Detection for CPU using the Glow Compiler 이제민 Korea Computer Congress (KCC) 2020, pp.212-214
Conference	2020	Performance Improvement by Extending ISA of a HLS Based Deep Learning Accelerator 권용인 IEMEK Symposium on Embedded Technology (ISET) 2020, pp.46-49
Conference	2020	Tiling and Scheduling: Machine code optimization for deep learning accelerators 권용인 Korea Computer Congress (KCC) 2020, pp.1-3