Building an AI Translator: TinyLLaMa vs Phi-2

모카쨩 2025. 5. 26. 09:16

https://www.youtube.com/watch?v=MyTmlXaMCvk

For how to use LLamaSharp, see the link below

https://wmmu.tistory.com/entry/%EB%9D%BC%EB%A7%88%EC%83%B5-LLaMaSharp-%EC%82%AC%EC%9A%A9%EB%B2%95

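Last time I wrote up how to use LLamaSharp.

But as you know, GPT has a tight grip on that space these days, so there is not much point in building yet another chatbot.

Fortunately, a chatbot is only one way to use it. Depending on the prompt, you can get all sorts of different output.

I used TinyLLaMa last time, so this time let's grab Phi-2.
Phi-2 can be downloaded from the link below.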

https://huggingface.co/TheBloke/phi-2-GGUF

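Just like last time, put the file in the Models folder.

This time I will run both models and compare how they perform.

I needed translation, so I built a translator.

The chatbot code from last time cannot be reused here,

because the prompt class it relied on was built specifically for chatbots.

The code I used is posted below.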

https://gist.github.com/ahzkwid/fe5a98c9db034cbec4d2c0aa710c9b95

using LLama;
using LLama.Common;
using LLama.Sampling;
using System.Diagnostics;
using System.Text; // StringBuilder

namespace LLaMaSharpSample2
{
    internal static class Translator
    {
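        // Stateless executor: every call starts from a clean context.
        private static readonly StatelessExecutor _executor;
        enum Model
        {
            TinyLLaMa, Phi2
        }
        // Change this to Model.Phi2 to test the other model.
        static readonly Model model = Model.TinyLLaMa;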
        static Translator()
        {
            var modelPath = "Models/tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf";
            if (model==Model.Phi2)
            {
                modelPath = "Models/phi-2.Q4_K_M.gguf";
            }
            var modelParams = new ModelParams(modelPath)
            {
                ContextSize = 2048
            };
            var weights = LLamaWeights.LoadFromFile(modelParams);
            _executor = new StatelessExecutor(weights, modelParams);


            Debug.WriteLine($"model: {model}");
            Debug.WriteLine("");
        }
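        // Blocking wrapper for callers that cannot await.
        public static string TranslateSync(string input)
        {
            // GetAwaiter().GetResult() rethrows the original exception
            // instead of wrapping it in an AggregateException like .Result does.
            return Task.Run(() => Translate(input)).GetAwaiter().GetResult();
        }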
        public static async Task<string> Translate(string input)
        {
            Debug.WriteLine($"input: {input}");

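            // Target language; switch to "Korean" for the other direction.
            var language = "English";
            //language = "Korean";
            // Default prompt (TinyLLaMa): plain "Text:" / "{language}:" labels.
            string prompt = $"Translate the following text into {language}\nText: {input}\n{language}: ";
            if (model == Model.Phi2)
            {
                // Phi-2 gave better output with tag-style markers.
                prompt = $"<|system|>Translate from Text to {language}\n<|Text|>{input}\n<|Translate|>";
            }

            var inferParams = new InferenceParams
            {
                //MaxTokens = 128,
                // Rough cap: twice the input's character count (characters, not tokens),
                // with a floor so very short inputs are not cut off immediately.
                MaxTokens = Math.Max(16, input.Length * 2),
                // Stop at the first newline so only a single line comes back.
                AntiPrompts = new List<string> { "\n" },
            };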

            StringBuilder result = new StringBuilder();
            Debug.Write("output: ");
            await foreach (var token in _executor.InferAsync(prompt, inferParams))
            {
                Debug.Write(token);
                result.Append(token);
            }

            Debug.WriteLine("");
            return result.ToString().Trim();
        }

    }
}

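This time I used StatelessExecutor.

Last time I used InteractiveExecutor so the model would remember the conversation,

but StatelessExecutor starts from a fresh state on every call, which makes it a good fit for a translator.

Passing a plain string prompt to InferAsync also lets you control the prompt in fine detail.

One caveat: the prompt has to be tuned slightly differently for each model.

If you look closely, Phi2 uses <|Text|> while tinyllama uses Text:.

The funny part is that the <|...|> tag style is actually the format recommended for tinyllama, yet I got better results with the prompts written as above.

Conversely, feeding tinyllama's prompt to phi2 does not produce the desired output either.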
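Since each model needs its own prompt, it can help to keep the templates in one place. Below is a minimal sketch of that idea, not the code from the gist: `TranslatorModel` and `PromptTemplates` are hypothetical names I made up for illustration, and only the two template strings come from the code above. It uses nothing from LLamaSharp; it only builds the string you would hand to InferAsync.

```csharp
using System;

// Hypothetical helper (not part of LLamaSharp): one prompt template per model.
public enum TranslatorModel { TinyLLaMa, Phi2 }

public static class PromptTemplates
{
    // Builds the raw prompt string for the given model.
    public static string Build(TranslatorModel model, string language, string input)
    {
        switch (model)
        {
            case TranslatorModel.Phi2:
                // Tag-style template, as used for Phi-2 in this post.
                return $"<|system|>Translate from Text to {language}\n<|Text|>{input}\n<|Translate|>";
            case TranslatorModel.TinyLLaMa:
                // Plain "Text:" template, as used for TinyLLaMa in this post.
                return $"Translate the following text into {language}\nText: {input}\n{language}: ";
            default:
                throw new ArgumentOutOfRangeException(nameof(model));
        }
    }
}
```

With this, trying a new model or a new wording means touching a single switch case, and the inference code does not have to change.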

Results

https://www.youtube.com/watch?v=MyTmlXaMCvk

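Phi-2 is the heavier model, but TinyLLaMa produced the better results.

Don't read too much into this: neither model was developed for translation in the first place,

so this test alone cannot tell you whether a model is good or bad overall.

All it tells me is that, for translation specifically, TinyLLaMa is the one to use.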
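If you want to put numbers on the comparison rather than eyeballing the video, a small timing harness is enough. This is only a sketch under my own assumptions: `TranslationBenchmark` is a hypothetical helper, not something from the gist or from LLamaSharp, and it accepts any `Func<string, string>` so it can be tried without a model loaded.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical benchmark helper: times any translate function over a list of inputs.
public static class TranslationBenchmark
{
    // Returns the outputs in input order together with the total elapsed time.
    public static (List<string> Outputs, TimeSpan Elapsed) Run(
        Func<string, string> translate, IEnumerable<string> inputs)
    {
        var outputs = new List<string>();
        var stopwatch = Stopwatch.StartNew();
        foreach (var input in inputs)
        {
            outputs.Add(translate(input));
        }
        stopwatch.Stop();
        return (outputs, stopwatch.Elapsed);
    }
}
```

With the Translator class above you would pass `Translator.TranslateSync` as the function. Two things to watch: the model is loaded in the static constructor, so the first call includes load time unless you make one warm-up call first, and the model is picked by a static field, so each model needs its own run.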