Unity ML-Agents is a useful tool that turns simulation environments built with the Unity game engine into environments for reinforcement learning. However, reference material on ML-Agents remains scarce, especially for ML-Agents 2.0 and later, which has made the toolkit difficult to use. This book covers the range of topics needed to work with Unity ML-Agents, including Unity, ML-Agents, and deep reinforcement learning. It is a revised edition of "Reinforcement Learning with TensorFlow and Unity ML-Agents" (《텐서플로와 유니티 ML-Agents로 배우는 강화학습》), published in 2020, and covers the latest version of ML-Agents.
The author earned a Ph.D. from the Department of Automotive Engineering at Hanyang University and currently works as an AI engineer at Kakao. The author is an organizer of Reinforcement Learning Korea, a Facebook group devoted to reinforcement learning, and was a member of cohorts 3 through 5 of Unity Masters, a group of Unity experts certified by Unity Korea.
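▣ Chapter 1: Overview of Reinforcement Learning
1.1 What is reinforcement learning?
___1.1.1 What is machine learning?
___1.1.2 Achievements of reinforcement learning
1.2 Basic terminology of reinforcement learning
1.3 Basic theory of reinforcement learning
___1.3.1 The Bellman equation
___1.3.2 Exploration and exploitation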
▣ Chapter 2: A Look at Unity ML-Agents
2.1 Unity and ML-Agents
___2.1.1 Unity
___2.1.2 ML-Agents
2.2 Installing Unity and basic operation
___2.2.1 Downloading and installing Unity Hub
___2.2.2 Activating a Unity license
___2.2.3 Installing the Unity Editor
___2.2.4 Creating a Unity project
___2.2.5 The Unity interface
___2.2.6 Basic Unity operations
2.3 Installing ML-Agents
___2.3.1 Downloading the ML-Agents files
___2.3.2 Installing ML-Agents in Unity
___2.3.3 Installing the ML-Agents Python package
2.4 Components of ML-Agents
___2.4.1 Behavior Parameters
___2.4.2 Agent Script
___2.4.3 Decision Requester, Model Overrider
___2.4.4 Building the environment
2.5 Using ML-Agents with mlagents-learn
___2.5.1 Reinforcement learning algorithms provided by ML-Agents
___2.5.2 Training methods provided by ML-Agents
___2.5.3 Training the 3DBall environment with the PPO algorithm
2.6 Using ML-Agents with the Python API
___2.6.1 Random agent control through the Python API
▣ Chapter 3: Building the GridWorld Environment
3.1 Starting the project
3.2 The GridWorld scripts explained
3.3 Adding vector observations and building the environment
3.4 Extra: optimizing the code
▣ Chapter 4: Deep Q Network (DQN)
4.1 Background of the DQN algorithm
___4.1.1 Value-based reinforcement learning
___4.1.2 Overview of the DQN algorithm
4.2 Techniques of the DQN algorithm
___4.2.1 Experience replay
___4.2.2 Target network
4.3 DQN training
4.4 DQN code
___4.4.1 Importing libraries and setting parameter values
___4.4.2 The Model class
___4.4.3 The Agent class
___4.4.4 The Main function
___4.4.5 Training results
▣ Chapter 5: Advantage Actor Critic (A2C)
5.1 Overview of the A2C algorithm
5.2 Structure of the actor-critic network
5.3 Training process of the A2C algorithm
5.4 The overall A2C training process
5.5 A2C code
___5.5.1 Importing libraries and setting parameter values
___5.5.2 The Model class
___5.5.3 The Agent class
___5.5.4 The Main function
___5.5.5 Training results
▣ Chapter 6: Building the Drone Environment
6.1 Starting the project
6.2 Importing the drone asset and adding objects
___6.2.1 Downloading the drone asset from the Asset Store
___6.2.2 Building the drone environment
6.3 The scripts explained
___6.3.1 The DroneSetting script
___6.3.2 The DroneAgent script
6.4 Running and building the drone environment
▣ Chapter 7: Deep Deterministic Policy Gradient (DDPG)
7.1 Overview of the DDPG algorithm
7.2 Techniques of the DDPG algorithm
___7.2.1 Experience replay
___7.2.2 Target network
___7.2.3 Soft target update
___7.2.4 OU noise (Ornstein-Uhlenbeck noise)
7.3 DDPG training
___7.3.1 Updating the critic network
___7.3.2 Updating the actor network
7.4 DDPG code
___7.4.1 Importing libraries and setting parameter values
___7.4.2 The OU Noise class
___7.4.3 The Actor class
___7.4.4 The Critic class
___7.4.5 The Agent class
___7.4.6 The Main function
___7.4.7 Training results
▣ Chapter 8: Building the Kart Racing Environment
8.1 Starting the project
8.2 Setting up the kart racing environment
8.3 Writing the scripts and building
▣ Chapter 9: Behavioral Cloning (BC)
9.1 Overview of the Behavioral Cloning algorithm
9.2 Techniques of the Behavioral Cloning algorithm
___9.2.1 Excluding data with negative rewards
9.3 Behavioral Cloning training
9.4 Behavioral Cloning code
___9.4.1 Importing libraries and setting parameter values
___9.4.2 The Model class
___9.4.3 The Agent class
___9.4.4 The Main function
___9.4.5 Training results
9.5 Using the built-in imitation learning of ML-Agents
___9.5.1 The Behavioral Cloning algorithm provided by ML-Agents
___9.5.2 The GAIL algorithm provided by ML-Agents
___9.5.3 Configuring the config file for imitation learning
___9.5.4 Imitation learning results in ML-Agents
▣ Chapter 10: Wrapping Up
10.1 Summary of the Basics volume
10.2 Further learning resources
___10.2.1 Unity
___10.2.2 Unity ML-Agents
___10.2.3 Reinforcement learning
10.3 What the Applications volume will cover