Study notes

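LLVM is a cross-platform compiler infrastructure. It can be used to compile and optimize code so that it runs more efficiently. OLLVM (Obfuscator-LLVM) builds on it to obfuscate code at compile time, which makes the result harder to read. The official LLVM site is http://llvm.org/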

LLVM has an intermediate language called LLVM IR. Here is the official description:

The LLVM Core libraries provide a modern source- and target-independent optimizer, along with code generation support for many popular CPUs (as well as some less common ones!) These libraries are built around a well specified code representation known as the LLVM intermediate representation ("LLVM IR"). The LLVM Core libraries are well documented, and it is particularly easy to invent your own language (or port an existing compiler) to use LLVM as an optimizer and code generator.

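Roughly speaking, LLVM IR is the intermediate language at the heart of the core libraries, and it is what lets LLVM target different CPUs. A cross-platform build can work like this: compile the C source to an IR file, compile that IR to assembly for the CPU at hand, then assemble and link the result into an executable binary. Because the IR instruction set is not tied to any one CPU, the idea is that an IR file produced on Linux can be carried to Windows and compiled there. In practice the IR that clang emits also records a target triple and a data layout, so it is portable only to the extent that the source does not depend on platform details.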

Next, another important part of LLVM: clang. Again, the official description first:

Clang is an "LLVM native" C/C++/Objective-C compiler, which aims to deliver amazingly fast compiles, extremely useful error and warning messages and to provide a platform for building great source level tools. The Clang Static Analyzer and clang-tidy are tools that automatically find bugs in your code, and are great examples of the sort of tools that can be built using the Clang frontend as a library to parse C/C++ code.

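clang is the C-family compiler in LLVM. It supports most C-related languages, such as C, C++, Objective C/C++, OpenCL, CUDA, and RenderScript. Code that gcc can compile can, for the most part, also be compiled with clang.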

Now for building LLVM itself. Here is the official build documentation:

http://llvm.org/docs/GettingStarted.html#getting-the-source-code-and-building-llvm

The build can then be done by following that document:

Checkout LLVM (including related subprojects like Clang):

git clone https://github.com/llvm/llvm-project.git
Or, on windows, git clone --config core.autocrlf=false https://github.com/llvm/llvm-project.git
Configure and build LLVM and Clang:

cd llvm-project

mkdir build

cd build

cmake -G <generator> [options] ../llvm

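Pay attention to the cmake arguments. The first one, -G <generator>, selects the build system to generate files for. On Unix-like systems CMake defaults to Unix Makefiles (make), and Ninja can be selected instead. The official documentation lists the choices: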

Ninja — for generating Ninja build files. Most llvm developers use Ninja.
Unix Makefiles — for generating make-compatible parallel makefiles.
Visual Studio — for generating Visual Studio projects and solutions.
Xcode — for generating Xcode projects.

The remaining arguments, [options], set whether the build type is Debug or Release and which subprojects are built:

-DLLVM_ENABLE_PROJECTS='...' — semicolon-separated list of the LLVM subprojects you’d like to additionally build. Can include any of: clang, clang-tools-extra, libcxx, libcxxabi, libunwind, lldb, compiler-rt, lld, polly, or debuginfo-tests.

For example, to build LLVM, Clang, libcxx, and libcxxabi, use -DLLVM_ENABLE_PROJECTS="clang;libcxx;libcxxabi".

-DCMAKE_INSTALL_PREFIX=directory — Specify for directory the full pathname of where you want the LLVM tools and libraries to be installed (default /usr/local).

-DCMAKE_BUILD_TYPE=type — Valid options for type are Debug, Release, RelWithDebInfo, and MinSizeRel. Default is Debug.

-DLLVM_ENABLE_ASSERTIONS=On — Compile with assertion checks enabled (default is Yes for Debug builds, No for all other build types).

If no build type is given, the default is Debug. Here is a Debug configuration:

cmake -G Ninja -DCMAKE_BUILD_TYPE=Debug -DLLVM_ENABLE_PROJECTS="clang" ../llvm

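To build Release instead, just change the build type. Once configuration finishes, run ninja -j4 to start the build. If a full build is not needed, run ninja <target> to build a single target; for example, ninja clang builds only clang.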

Next, try out clang and the IR with an example.

Start with a simple hello world program:

#include <stdio.h>
int main(int argc, char** argv) {
    printf("hello world!!");
    return 0;
}

Then use clang to compile this code to IR:

clang -emit-llvm -S main.c -o main.ll

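The resulting main.ll is the textual IR. Its instructions are not tied to a particular CPU instruction set, although the file does record the target triple and data layout it was generated for. Here is the IR: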

; ModuleID = 'main.c'
source_filename = "main.c"
target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-apple-macosx10.15.0"

@.str = private unnamed_addr constant [14 x i8] c"hello world!!\00", align 1

; Function Attrs: noinline nounwind optnone ssp uwtable
define i32 @main(i32 %argc, i8** %argv) #0 {
entry:
  %retval = alloca i32, align 4
  %argc.addr = alloca i32, align 4
  %argv.addr = alloca i8**, align 8
  store i32 0, i32* %retval, align 4
  store i32 %argc, i32* %argc.addr, align 4
  store i8** %argv, i8*** %argv.addr, align 8
  %call = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([14 x i8], [14 x i8]* @.str, i64 0, i64 0))
  ret i32 0
}

declare i32 @printf(i8*, ...) #1

attributes #0 = { noinline nounwind optnone ssp uwtable "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "less-precise-fpmad"="false" "min-legal-vector-width"="0" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="penryn" "target-features"="+cx16,+cx8,+fxsr,+mmx,+sahf,+sse,+sse2,+sse3,+sse4.1,+ssse3,+x87" "unsafe-fp-math"="false" "use-soft-float"="false" }
attributes #1 = { "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="penryn" "target-features"="+cx16,+cx8,+fxsr,+mmx,+sahf,+sse,+sse2,+sse3,+sse4.1,+ssse3,+x87" "unsafe-fp-math"="false" "use-soft-float"="false" }

!llvm.module.flags = !{!0, !1, !2}
!llvm.ident = !{!3}

!0 = !{i32 2, !"SDK Version", [3 x i32] [i32 10, i32 15, i32 6]}
!1 = !{i32 1, !"wchar_size", i32 4}
!2 = !{i32 7, !"PIC Level", i32 2}
!3 = !{!"clang version 9.0.1 "}

Next, assemble the IR file into binary bitcode:

llvm-as main.ll -o main.bc

A C file can also be compiled straight to a .bc file:

clang -emit-llvm -c main.c -o main.bc

The bitcode file can be disassembled back into an IR file:

llvm-dis main.bc -o main.ll

A .bc file can also be compiled directly to assembly for the current CPU:

llc main.bc -o main.s

The output is now assembly for the current CPU. Here is main.s:

    .section    __TEXT,__text,regular,pure_instructions
    .macosx_version_min 10, 15 sdk_version 10, 15, 6
    .globl _main ## -- Begin function main
    .p2align 4, 0x90
_main: ## @main
    .cfi_startproc
## %bb.0: ## %entry
    pushq %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset %rbp, -16
    movq %rsp, %rbp
    .cfi_def_cfa_register %rbp
    subq $16, %rsp
    movl $0, -8(%rbp)
    movl %edi, -4(%rbp)
    movq %rsi, -16(%rbp)
    leaq L_.str(%rip), %rdi
    movb $0, %al
    callq _printf
    xorl %eax, %eax
    addq $16, %rsp
    popq %rbp
    retq
    .cfi_endproc
## -- End function
    .section __TEXT,__cstring,cstring_literals
L_.str: ## @.str
    .asciz "hello world!!"
.subsections_via_symbols

Finally, the executable binary can be built from the assembly:

clang main.s -o main

The resulting main can be run directly. While the program is still an IR file or a bitcode file, it can also be run with lli:

lli main.ll
lli main.bc