Parsing a Struct in Swift the Way HandyJSON Does
HandyJSON
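HandyJSON is a framework developed by Alibaba that converts JSON data into the corresponding model in Swift. Compared with other popular Swift JSON libraries, its distinguishing features are that it supports pure Swift types and is simple to use. When deserializing (converting JSON into a model), it does not require the model to inherit from NSObject, because it is not based on KVC, and it does not require you to define a mapping function for the model. Once you define the model type and declare that it conforms to the HandyJSON protocol, HandyJSON uses each property's name as the key and parses the values out of the JSON string on its own. Because HandyJSON relies on Swift's metadata layout, however, it may stop working if that layout changes. Alibaba has kept maintaining the framework, so it can be expected to follow changes in the Swift source.
HandyJSON on GitHub
Parsing a struct from the source code
Getting TargetStructMetadata
Since HandyJSON is built on Swift's metadata, parsing a struct requires understanding that metadata. Below, we look for it in the Swift source code.
First, searching Metadata.h for StructMetadata shows that its real type is TargetStructMetadata.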
using StructMetadata = TargetStructMetadata<InProcess>;
Next, looking at TargetStructMetadata, we find that it inherits from TargetValueMetadata, which in turn inherits from TargetMetadata.
struct TargetStructMetadata : public TargetValueMetadata<Runtime> {
struct TargetValueMetadata : public TargetMetadata<Runtime> {
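We can therefore follow this inheritance chain to reconstruct the layout of TargetStructMetadata.
From the code we can see that the first field of TargetStructMetadata is Kind; besides it there is a Description field, which points to the type's descriptor.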
struct TargetMetadata {
......
private:
/// The kind. Only valid for non-class metadata; getKind() must be used to get
/// the kind value.
StoredPointer Kind;
......
}
struct TargetValueMetadata : public TargetMetadata<Runtime> {
using StoredPointer = typename Runtime::StoredPointer;
TargetValueMetadata(MetadataKind Kind,
const TargetTypeContextDescriptor<Runtime> *description)
: TargetMetadata<Runtime>(Kind), Description(description) {}
// Records the description (descriptor) of the metadata
/// An out-of-line description of the type.
TargetSignedPointer<Runtime, const TargetValueTypeDescriptor<Runtime> * __ptrauth_swift_type_descriptor> Description;
......
}
This gives us the following structure for TargetStructMetadata:
struct TargetStructMetadata {
// StoredPointer Kind; on 64-bit systems, using StoredPointer = uint64_t; so we use Int
var kind: Int
// Declared as UnsafeMutablePointer for now; the structure of typeDescriptor is analyzed later. T is a placeholder generic type
var typeDescriptor: UnsafeMutablePointer<T>
}
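The claim that the first word of the metadata is the kind can be checked directly. The following is a minimal sketch, not HandyJSON's code: `SamplePoint` and `metadataKind(of:)` are hypothetical names introduced only for this illustration, and the sketch assumes the stable Swift 5 ABI, in which struct metadata has kind 0x200.

```swift
// Hypothetical sample type, used only for this illustration.
struct SamplePoint {
    var x: Int = 0
    var y: Int = 0
}

/// Reads the first pointer-sized word of a type's metadata record.
/// For non-class types this word holds the metadata kind.
func metadataKind(of type: Any.Type) -> Int {
    // Any.Type is a single pointer to the metadata record,
    // so it can be bit-cast to a pointer to its first word.
    let metadata = unsafeBitCast(type, to: UnsafePointer<Int>.self)
    return metadata.pointee
}

let structKind = metadataKind(of: SamplePoint.self)
// Print the kind in hexadecimal.
print(String(structKind, radix: 16))
```

Standard-library value types such as Int are structs too, so `metadataKind(of: Int.self)` should report the same kind.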
Getting TargetStructDescriptor
Next we look at Description. The source shows that, for a struct, Description has the type TargetStructDescriptor.
const TargetStructDescriptor<Runtime> *getDescription() const {
return llvm::cast<TargetStructDescriptor<Runtime>>(this->Description);
}
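Looking up TargetStructDescriptor shows that it inherits from TargetValueTypeDescriptor and has two fields: NumFields (the number of stored properties) and FieldOffsetVectorOffset (the offset, within the metadata, of the field offset vector).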
class TargetStructDescriptor final
: public TargetValueTypeDescriptor<Runtime>,
public TrailingGenericContextObjects<TargetStructDescriptor<Runtime>,
TargetTypeGenericContextDescriptorHeader,
/*additional trailing objects*/
TargetForeignMetadataInitialization<Runtime>,
TargetSingletonMetadataInitialization<Runtime>,
TargetCanonicalSpecializedMetadatasListCount<Runtime>,
TargetCanonicalSpecializedMetadatasListEntry<Runtime>,
TargetCanonicalSpecializedMetadatasCachingOnceToken<Runtime>> {
......
/// The number of stored properties in the struct.
/// If there is a field offset vector, this is its length.
uint32_t NumFields; // number of stored properties
/// The offset of the field offset vector for this struct's stored
/// properties in its metadata, if any. 0 means there is no field offset
/// vector.
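uint32_t FieldOffsetVectorOffset; // offset of the field offset vector within the metadata
TargetValueTypeDescriptor inherits from TargetTypeContextDescriptor, which has three fields: Name (the name of the type), AccessFunctionPtr (a pointer to the type's metadata access function), and Fields (a pointer to the type's field descriptor).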
class TargetValueTypeDescriptor
: public TargetTypeContextDescriptor<Runtime> {
public:
static bool classof(const TargetContextDescriptor<Runtime> *cd) {
return cd->getKind() == ContextDescriptorKind::Struct ||
cd->getKind() == ContextDescriptorKind::Enum;
}
};
class TargetTypeContextDescriptor
: public TargetContextDescriptor<Runtime> {
public:
/// The name of the type.
// Name of the type (relative pointer)
TargetRelativeDirectPointer<Runtime, const char, /*nullable*/ false> Name;
/// A pointer to the metadata access function for this type.
///
/// The function type here is a stand-in. You should use getAccessFunction()
/// to wrap the function pointer in an accessor that uses the proper calling
/// convention for a given number of arguments.
// Pointer to this type's metadata access function
TargetRelativeDirectPointer<Runtime, MetadataResponse(...),
/*Nullable*/ true> AccessFunctionPtr;
/// A pointer to the field descriptor for the type, if any.
// Pointer to the type's field descriptor
TargetRelativeDirectPointer<Runtime, const reflection::FieldDescriptor,
/*nullable*/ true> Fields;
......
}
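TargetTypeContextDescriptor in turn inherits from the base class TargetContextDescriptor, which has two fields: Flags (flags describing the context, including its kind and version) and Parent (the parent context, that is, the enclosing context rather than a superclass; it is null for a top-level context).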
/// Base class for all context descriptors.
template<typename Runtime>
struct TargetContextDescriptor {
/// Flags describing the context, including its kind and format version.
// Flags describing the context, including kind and version
ContextDescriptorFlags Flags;
/// The parent context, or null if this is a top-level context.
// The parent (enclosing) context; null if this is a top-level context
TargetRelativeContextPointer<Runtime> Parent;
......
}
At this point TargetStructDescriptor is clear, so we can write out its structure and also replace the placeholder generic T in TargetStructMetadata.
struct TargetStructMetadata {
var kind: Int
var typeDescriptor: UnsafeMutablePointer<TargetStructDescriptor>
}
struct TargetStructDescriptor {
// Flags describing the context, including kind and version
var flags: Int32 // ContextDescriptorFlags, 32 bits
// The parent (enclosing) context; null if this is a top-level context
var parent: TargetRelativeContextPointer<UnsafeRawPointer> // relative address
// The name of the type
var name: TargetRelativeDirectPointer<CChar> // relative address
// Pointer to the metadata access function for this type
var accessFunctionPointer: TargetRelativeDirectPointer<UnsafeRawPointer> // relative address
// Pointer to the field descriptor for the type
var fieldDescriptor: TargetRelativeDirectPointer<FieldDescriptor> // relative address
// Number of stored properties
var numFields: Int32
// Offset of the field offset vector within the metadata
var fieldOffsetVectorOffset: Int32
}
// Type details for some of the fields follow
/// Common flags stored in the first 32-bit word of any context descriptor.
// flags is a 32-bit value
struct ContextDescriptorFlags {
private:
uint32_t Value;
}
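To make "flags including kind and version" concrete, here is a small sketch of how the 32-bit flags word can be decoded. `DecodedContextFlags` and `decodeContextFlags` are hypothetical names. The bit positions (kind in the low 5 bits, unique in bit 6, generic in bit 7, version in bits 8 to 15) and the kind values (16 for class, 17 for struct, 18 for enum) are taken from my reading of MetadataValues.h in the Swift runtime, so treat them as an assumption to verify against the source version you target.

```swift
/// Decoded view of the first 32-bit word of a context descriptor.
/// Hypothetical helper type, not part of the Swift runtime or HandyJSON.
struct DecodedContextFlags {
    let kind: UInt8       // low 5 bits: 16 = class, 17 = struct, 18 = enum
    let isUnique: Bool    // bit 6
    let isGeneric: Bool   // bit 7
    let version: UInt8    // bits 8...15
}

func decodeContextFlags(_ raw: UInt32) -> DecodedContextFlags {
    return DecodedContextFlags(
        kind: UInt8(raw & 0x1F),
        isUnique: (raw & 0x40) != 0,
        isGeneric: (raw & 0x80) != 0,
        version: UInt8((raw >> 8) & 0xFF))
}

// Example raw value: 0x51 = unique bit (0x40) plus kind 17 (0x11).
let sampleFlags = decodeContextFlags(0x51)
print(sampleFlags.kind, sampleFlags.isUnique, sampleFlags.isGeneric, sampleFlags.version)
```

Declaring the field as Int32 in the Swift mirror struct keeps the layout right; converting it with `UInt32(bitPattern:)` before decoding avoids sign issues.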
Implementing TargetRelativeDirectPointer
For the relative address type TargetRelativeDirectPointer, searching the source shows that it is simply an alias for RelativeDirectPointer.
template <typename Runtime, typename Pointee, bool Nullable = true>
using TargetRelativeDirectPointer
= typename Runtime::template RelativeDirectPointer<Pointee, Nullable>;
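Then, in RelativePointer.h, we find that RelativeDirectPointer inherits from the base class RelativeDirectPointerImpl, which holds a single field, RelativeOffset (the offset), and provides a method that turns the offset into the real memory address.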
template <typename T, bool Nullable = true, typename Offset = int32_t,
typename = void>
class RelativeDirectPointer;
/// A direct relative reference to an object that is not a function pointer.
// Offset is int32_t
template <typename T, bool Nullable, typename Offset>
class RelativeDirectPointer<T, Nullable, Offset,
typename std::enable_if<!std::is_function<T>::value>::type>
: private RelativeDirectPointerImpl<T, Nullable, Offset>
{
......
}
/// A relative reference to a function, intended to reference private metadata
/// functions for the current executable or dynamic library image from
/// position-independent constant data.
template<typename T, bool Nullable, typename Offset>
class RelativeDirectPointerImpl {
private:
/// The relative offset of the function's entry point from *this.
Offset RelativeOffset;
......
// Computes the address from the offset and returns it typed as a pointer to T
PointerTy get() const & {
// Check for null.
if (Nullable && RelativeOffset == 0)
return nullptr;
// The value is addressed relative to `this`.
uintptr_t absolute = detail::applyRelativeOffset(this, RelativeOffset);
return reinterpret_cast<PointerTy>(absolute);
}
......
}
/// Apply a relative offset to a base pointer. The offset is applied to the base
/// pointer using sign-extended, wrapping arithmetic.
// Address computation from the offset
template<typename BasePtrTy, typename Offset>
static inline uintptr_t applyRelativeOffset(BasePtrTy *basePtr, Offset offset) {
static_assert(std::is_integral<Offset>::value &&
std::is_signed<Offset>::value,
"offset type should be signed integer");
auto base = reinterpret_cast<uintptr_t>(basePtr);
// We want to do wrapping arithmetic, but with a sign-extended
// offset. To do this in C, we need to do signed promotion to get
// the sign extension, but we need to perform arithmetic on unsigned values,
// since signed overflow is undefined behavior.
auto extendOffset = (uintptr_t)(intptr_t)offset;
// pointer address + stored offset: shift in memory to reach the value
return base + extendOffset;
}
We can then write the structure of TargetRelativeDirectPointer:
// Generic over the pointee type Pointee
struct TargetRelativeDirectPointer<Pointee> {
var offset: Int32
// Compute the target address from the offset
mutating func getmeasureRelativeOffset() -> UnsafeMutablePointer<Pointee> {
let offset = self.offset
return withUnsafePointer(to: &self) { p in
// Advance by offset bytes, then bind the result to Pointee
return UnsafeMutablePointer(mutating: UnsafeRawPointer(p).advanced(by: numericCast(offset)).assumingMemoryBound(to: Pointee.self))
}
}
}
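The rule "target address = address of the offset field itself + signed 32-bit offset" can be tried out on a hand-built buffer, without touching real metadata. This is only an illustrative sketch: `resolveRelative(at:)` and `relativePointerDemo()` are hypothetical names, and treating an offset of 0 as null mirrors the nullable variant shown in the runtime source.

```swift
/// Resolves the 32-bit relative offset stored at `field`.
/// The target is addressed relative to the field's own location.
func resolveRelative(at field: UnsafeRawPointer) -> UnsafeRawPointer? {
    let offset = field.load(as: Int32.self)
    guard offset != 0 else { return nil }   // 0 means null for nullable pointers
    return field + Int(offset)              // sign-extended offset added to the base
}

func relativePointerDemo() -> (forward: UInt32?, backward: UInt32?, isNull: Bool) {
    let buffer = UnsafeMutableRawPointer.allocate(byteCount: 16, alignment: 8)
    defer { buffer.deallocate() }
    buffer.initializeMemory(as: UInt8.self, repeating: 0, count: 16)

    // Byte 0: offset +8, pointing forward at the payload in bytes 8...11.
    buffer.storeBytes(of: Int32(8), toByteOffset: 0, as: Int32.self)
    // Bytes 8...11: the payload.
    buffer.storeBytes(of: UInt32(0xCAFE), toByteOffset: 8, as: UInt32.self)
    // Byte 12: offset -4, pointing backward at the same payload.
    buffer.storeBytes(of: Int32(-4), toByteOffset: 12, as: Int32.self)

    let base = UnsafeRawPointer(buffer)
    let forward = resolveRelative(at: base)?.load(as: UInt32.self)
    let backward = resolveRelative(at: base + 12)?.load(as: UInt32.self)
    let isNull = resolveRelative(at: base + 4) == nil   // bytes 4...7 are still zero
    return (forward, backward, isNull)
}

let relativeDemo = relativePointerDemo()
print(relativeDemo)
```

Both the forward and the backward field resolve to the same payload, which is the property that lets the compiler emit position-independent metadata.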
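We can also revise TargetStructDescriptor to:
struct TargetStructDescriptor {
// Flags describing the context, including kind and version
var flags: Int32
// The parent (enclosing) context; null if this is a top-level context
var parent: Int32 // not parsed here, so declared as Int32 for now
// The name of the type
var name: TargetRelativeDirectPointer<CChar>
// Pointer to the metadata access function for this type
var accessFunctionPointer: TargetRelativeDirectPointer<UnsafeRawPointer>
// Pointer to the field descriptor for the type
var fieldDescriptor: TargetRelativeDirectPointer<FieldDescriptor>
// Number of stored properties
var numFields: Int32
// Offset of the field offset vector within the metadata
var fieldOffsetVectorOffset: Int32
}
// TargetRelativeContextPointer is not parsed for now; from the source it is a 32-bit relative offset, so Int32 is used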
template<typename Runtime,
template<typename _Runtime> class Context = TargetContextDescriptor>
using TargetRelativeContextPointer =
RelativeIndirectablePointer<const Context<Runtime>,
/*nullable*/ true, int32_t,
TargetSignedContextPointer<Runtime, Context>>;
FieldDescriptor and FieldRecord
The next step is to parse FieldDescriptor, which looks like this in the source:
// Field descriptors contain a collection of field records for a single
// class, struct or enum declaration.
class FieldDescriptor {
const FieldRecord *getFieldRecordBuffer() const {
return reinterpret_cast<const FieldRecord *>(this + 1);
}
public:
const RelativeDirectPointer<const char> MangledTypeName;
const RelativeDirectPointer<const char> Superclass;
FieldDescriptor() = delete;
const FieldDescriptorKind Kind;
const uint16_t FieldRecordSize;
const uint32_t NumFields;
......
// Returns all fields; each one is wrapped in a FieldRecord
llvm::ArrayRef<FieldRecord> getFields() const {
return {getFieldRecordBuffer(), NumFields};
}
......
}
// FieldDescriptorKind is a UInt16
enum class FieldDescriptorKind : uint16_t {
......
}
In the source, FieldRecord is structured as follows:
class FieldRecord {
const FieldRecordFlags Flags;
public:
const RelativeDirectPointer<const char> MangledTypeName;
const RelativeDirectPointer<const char> FieldName;
......
}
// Field records describe the type of a single stored property or case member
// of a class, struct or enum.
// FieldRecordFlags wraps a 32-bit value (uint32_t)
class FieldRecordFlags {
using int_type = uint32_t;
......
}
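Because getFieldRecordBuffer() returns `this + 1`, the records start immediately after the fixed-size header. A short sketch of the resulting arithmetic follows; `FieldDescriptorHeader`, `FieldRecordLayout` and `fieldRecordByteOffset(index:)` are hypothetical names that mirror only the fields quoted above.

```swift
/// Fixed-size header of a field descriptor (the five fields shown above).
struct FieldDescriptorHeader {
    var mangledTypeName: Int32   // relative offset
    var superclass: Int32        // relative offset
    var kind: UInt16             // FieldDescriptorKind
    var fieldRecordSize: UInt16
    var numFields: UInt32
}

/// One field record: flags plus two relative offsets.
struct FieldRecordLayout {
    var flags: UInt32
    var mangledTypeName: Int32   // relative offset
    var fieldName: Int32         // relative offset
}

/// Byte offset of the i-th record, measured from the start of the descriptor.
func fieldRecordByteOffset(index: Int) -> Int {
    return MemoryLayout<FieldDescriptorHeader>.size
        + index * MemoryLayout<FieldRecordLayout>.stride
}

print(MemoryLayout<FieldDescriptorHeader>.size)    // header size in bytes
print(MemoryLayout<FieldRecordLayout>.stride)      // distance between records
print(fieldRecordByteOffset(index: 2))             // start of the third record
```

In real code it is safer to read FieldRecordSize from the header than to hard-code the stride, since the runtime stores it for exactly that purpose.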
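Computing offsets with fieldOffsetVectorOffset
Finally, fieldOffsetVectorOffset (the position of the field offset vector within the metadata) is used to obtain each property's offset. What the source provides is:
// Note: StoredPointer is pointer-sized, not Int32; for structs, each entry of the field offset vector is a 32-bit value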
/// Get a pointer to the field offset vector, if present, or null.
const StoredPointer *getFieldOffsets() const {
assert(isTypeMetadata());
auto offset = getDescription()->getFieldOffsetVectorOffset();
if (offset == 0)
return nullptr;
auto asWords = reinterpret_cast<const void * const*>(this);
return reinterpret_cast<const StoredPointer *>(asWords + offset);
}
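Indexing an Int32 pointer with this logic directly yields wrong data, because the offset is counted in pointer-sized words while each struct field offset is a 32-bit value. HandyJSON's source handles this as follows:
// On 64-bit platforms the offset is multiplied by 2
return Int(UnsafePointer<Int32>(pointer)[vectorOffset * (is64BitPlatform ? 2 : 1) + $0])
At this point we have a fairly clear picture of the structures, as follows:
// Resolves an address from a relative offset; generic over Pointee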
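The word-versus-Int32 conversion can be cross-checked against the standard library. The sketch below reads the field offset vector with raw loads and compares it with `MemoryLayout.offset(of:)`. `DemoPerson` and `fieldOffsets(ofStruct:)` are hypothetical names, and the byte offsets 20 and 24 into the descriptor follow from the field order derived earlier (seven 32-bit fields). It assumes a non-generic struct on a platform without pointer authentication; on arm64e the descriptor pointer is signed and would need extra handling.

```swift
// Hypothetical sample type with the same shape as the article's LLPerson.
struct DemoPerson {
    var age: Int = 18
    var name: String = "LL"
    var nameTwo: String = "LLLL"
}

/// Reads the field offset vector of a non-generic struct from its metadata.
func fieldOffsets(ofStruct type: Any.Type) -> [Int] {
    let word = MemoryLayout<UnsafeRawPointer>.size
    let metadata = unsafeBitCast(type, to: UnsafeRawPointer.self)
    precondition(metadata.load(as: Int.self) == 0x200, "expected struct metadata")

    // Word 0 is the kind, word 1 is the pointer to the type descriptor.
    let descriptor = metadata.load(fromByteOffset: word, as: UnsafeRawPointer.self)
    let numFields = Int(descriptor.load(fromByteOffset: 20, as: UInt32.self))
    let vectorOffsetInWords = Int(descriptor.load(fromByteOffset: 24, as: UInt32.self))
    guard vectorOffsetInWords != 0 else { return [] }   // 0 means no vector

    // The offset counts pointer-sized words; the entries are 32-bit values.
    let vector = (metadata + vectorOffsetInWords * word)
        .assumingMemoryBound(to: UInt32.self)
    return (0..<numFields).map { Int(vector[$0]) }
}

let runtimeOffsets = fieldOffsets(ofStruct: DemoPerson.self)
let layoutOffsets = [
    MemoryLayout<DemoPerson>.offset(of: \DemoPerson.age)!,
    MemoryLayout<DemoPerson>.offset(of: \DemoPerson.name)!,
    MemoryLayout<DemoPerson>.offset(of: \DemoPerson.nameTwo)!,
]
print(runtimeOffsets, layoutOffsets)
```

Advancing a raw pointer by `vectorOffsetInWords * word` bytes is equivalent to HandyJSON's multiplication by 2 on an Int32 pointer: a 64-bit word is two Int32 slots wide.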
struct TargetRelativeDirectPointer<Pointee> {
var offset: Int32
// Compute the target address from the offset
mutating func getmeasureRelativeOffset() -> UnsafeMutablePointer<Pointee> {
let offset = self.offset
return withUnsafePointer(to: &self) { p in
// Advance by offset bytes, then bind the result to Pointee
return UnsafeMutablePointer(mutating: UnsafeRawPointer(p).advanced(by: numericCast(offset)).assumingMemoryBound(to: Pointee.self))
}
}
}
struct TargetStructMetadata {
var kind: Int
var typeDescriptor: UnsafeMutablePointer<TargetStructDescriptor>
}
struct TargetStructDescriptor {
var flags: Int32
var parent: Int32
var name: TargetRelativeDirectPointer<CChar>
var accessFunctionPointer: TargetRelativeDirectPointer<UnsafeRawPointer>
var fieldDescriptor: TargetRelativeDirectPointer<FieldDescriptor>
var numFields: Int32
var fieldOffsetVectorOffset: Int32
func getFieldOffsets(_ metadata: UnsafeRawPointer) -> UnsafePointer<Int32> {
print(metadata)
return metadata.assumingMemoryBound(to: Int32.self).advanced(by: numericCast(self.fieldOffsetVectorOffset) * 2)
}
// Used when resolving a field's metatype: offset, in words, of the generic argument vector
var genericArgumentOffset: Int {
return 2
}
}
struct FieldDescriptor {
var MangledTypeName: TargetRelativeDirectPointer<CChar>
var Superclass: TargetRelativeDirectPointer<CChar>
var kind: UInt16
var fieldRecordSize: UInt16
var numFields: Int32
var fields: FieldRecordBuffer<FieldRecord>
}
struct FieldRecord {
var fieldRecordFlags: Int32
var mangledTypeName: TargetRelativeDirectPointer<CChar>
var fieldName: TargetRelativeDirectPointer<UInt8>
}
// Accessing the FieldRecord entries
struct FieldRecordBuffer<Element> {
var element: Element
mutating func buffer(n: Int) -> UnsafeBufferPointer<Element> {
return withUnsafePointer(to: &self) {
let ptr = $0.withMemoryRebound(to: Element.self, capacity: 1) { start in
return start
}
return UnsafeBufferPointer(start: ptr, count: n)
}
}
mutating func index(of i: Int) -> UnsafeMutablePointer<Element> {
return withUnsafePointer(to: &self) {
return UnsafeMutablePointer(mutating: UnsafeRawPointer($0).assumingMemoryBound(to: Element.self).advanced(by: i))
}
}
}
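Before moving to the typed structs, the same layout can be exercised with nothing but raw byte offsets, which makes it easy to see whether an offset is off by a field. This is a hedged sketch rather than the article's or HandyJSON's implementation: `DemoPet`, `resolveField(_:_:)` and `describeStruct(_:)` are hypothetical names, the numeric offsets come from the field order derived above, and it assumes reflection metadata is present and that the descriptor pointer is unsigned (not arm64e).

```swift
// Hypothetical sample type for this illustration.
struct DemoPet {
    var legs: Int = 4
    var nickname: String = "Mo"
}

/// Resolves the relative pointer stored `fieldOffset` bytes into `base`.
/// Returns nil when the stored offset is 0 (null).
func resolveField(_ base: UnsafeRawPointer, _ fieldOffset: Int) -> UnsafeRawPointer? {
    let field = base + fieldOffset
    let relative = field.load(as: Int32.self)
    return relative == 0 ? nil : field + Int(relative)
}

func describeStruct(_ type: Any.Type) -> (name: String, fields: [String]) {
    let metadata = unsafeBitCast(type, to: UnsafeRawPointer.self)
    precondition(metadata.load(as: Int.self) == 0x200, "expected struct metadata")
    let descriptor = metadata.load(fromByteOffset: MemoryLayout<Int>.size,
                                   as: UnsafeRawPointer.self)

    // Descriptor layout: flags(0) parent(4) name(8) accessFunction(12) fields(16).
    guard let namePtr = resolveField(descriptor, 8) else { return ("", []) }
    let name = String(cString: namePtr.assumingMemoryBound(to: CChar.self))
    guard let fieldDescriptor = resolveField(descriptor, 16) else { return (name, []) }

    // Field descriptor header: ... fieldRecordSize(10, UInt16) numFields(12, UInt32).
    let recordSize = Int(fieldDescriptor.load(fromByteOffset: 10, as: UInt16.self))
    let count = Int(fieldDescriptor.load(fromByteOffset: 12, as: UInt32.self))

    var fields: [String] = []
    for i in 0..<count {
        let record = fieldDescriptor + 16 + i * recordSize
        // Record layout: flags(0) mangledTypeName(4) fieldName(8).
        if let fieldName = resolveField(record, 8) {
            fields.append(String(cString: fieldName.assumingMemoryBound(to: CChar.self)))
        }
    }
    return (name, fields)
}

let petDescription = describeStruct(DemoPet.self)
print(petDescription.name, petDescription.fields)
```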
Verifying with code
Next we verify the structures derived above with code.
protocol BrigeProtocol {}
extension BrigeProtocol {
// Rebind the memory to the real type through the protocol and return the value
static func get(from pointor: UnsafeRawPointer) -> Any {
// Self is the real type
pointor.assumingMemoryBound(to: Self.self).pointee
}
}
struct BrigeMetadataStruct {
let type: Any.Type
let witness: Int
}
func custom(type: Any.Type) -> BrigeProtocol.Type {
let container = BrigeMetadataStruct(type: type, witness: 0)
let cast = unsafeBitCast(container, to: BrigeProtocol.Type.self)
return cast
}
// The LLPerson struct
struct LLPerson {
var age: Int = 18
var name: String = "LL"
var nameTwo: String = "LLLL"
}
// Create an instance
var p = LLPerson()
// Bit-cast LLPerson's metatype to UnsafeMutablePointer<TargetStructMetadata>, so its metadata can be read through our TargetStructMetadata
let ptr = unsafeBitCast(LLPerson.self as Any.Type, to: UnsafeMutablePointer<TargetStructMetadata>.self)
// Get the struct's name
let namePtr = ptr.pointee.typeDescriptor.pointee.name.getmeasureRelativeOffset()
print("current struct name: \(String(cString: namePtr))")
// Get the number of properties
let numFields = ptr.pointee.typeDescriptor.pointee.numFields
print("number of properties in the struct: \(numFields)")
// Get the properties' offsets from the metadata
let offsets = ptr.pointee.typeDescriptor.pointee.getFieldOffsets(UnsafeRawPointer(ptr).assumingMemoryBound(to: Int.self))
print("----------- start fetch field -------------")
for i in 0..<numFields {
// Get the property name
let fieldName = ptr.pointee.typeDescriptor.pointee.fieldDescriptor.getmeasureRelativeOffset().pointee.fields.index(of: Int(i)).pointee.fieldName.getmeasureRelativeOffset()
print("----- field \(String(cString: fieldName)) -----")
// Get the property's offset, in bytes
let fieldOffset = offsets[Int(i)]
print("offset of \(String(cString: fieldName)): \(fieldOffset) bytes")
// This is the mangled Swift type name; it has to be converted into the real type
let typeMangleName = ptr.pointee.typeDescriptor.pointee.fieldDescriptor.getmeasureRelativeOffset().pointee.fields.index(of: Int(i)).pointee.mangledTypeName.getmeasureRelativeOffset()
// print("\(String(cString: typeMangleName))")
let genericVector = UnsafeRawPointer(ptr).advanced(by: ptr.pointee.typeDescriptor.pointee.genericArgumentOffset * MemoryLayout<UnsafeRawPointer>.size).assumingMemoryBound(to: Any.Type.self)
// Requires the runtime function swift_getTypeByMangledNameInContext (declared separately, e.g. via @_silgen_name), which takes four arguments
let fieldType = swift_getTypeByMangledNameInContext(
typeMangleName, // the mangled name
256, // length of the mangled name; it should be computed, but HandyJSON simply uses 256
UnsafeRawPointer(ptr.pointee.typeDescriptor), // the context: the type descriptor
UnsafeRawPointer(genericVector).assumingMemoryBound(to: Optional<UnsafeRawPointer>.self)) // the current generic arguments, used to resolve symbols
// Bit-cast fieldType to Any.Type
let type = unsafeBitCast(fieldType, to: Any.Type.self)
// Bridge through the protocol to get the real type information
let value = custom(type: type)
// Get a pointer to instance p, converted to UnsafeRawPointer and bound to a 1-byte type (Int8),
// because the offsets below are in bytes; otherwise advancing would step by the size of the struct
let instanceAddress = withUnsafePointer(to: &p){return UnsafeRawPointer($0).assumingMemoryBound(to: Int8.self)}
print("fieldType: \(type) \nfieldValue: \(value.get(from: instanceAddress.advanced(by: Int(fieldOffset))))")
}
print("----------- end fetch field -------------")
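The printed output lists each field's name, byte offset, type, and value.

The memory addresses also show how the properties are laid out.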
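As an independent cross-check of the names, types, and values that the manual parse should report, the standard library's Mirror can list the same information (read-only). This is a sketch for comparison only, not how HandyJSON works internally: `DemoUser` and `reflectedFields(of:)` are hypothetical names.

```swift
// Hypothetical sample type with the same shape as LLPerson.
struct DemoUser {
    var age: Int = 18
    var name: String = "LL"
    var nameTwo: String = "LLLL"
}

/// Lists each stored property's label, dynamic type name and value
/// using the standard library's Mirror.
func reflectedFields(of value: Any) -> [(label: String, type: String, value: String)] {
    return Mirror(reflecting: value).children.compactMap {
        child -> (label: String, type: String, value: String)? in
        guard let label = child.label else { return nil }
        return (label: label,
                type: String(describing: type(of: child.value)),
                value: String(describing: child.value))
    }
}

let reflected = reflectedFields(of: DemoUser())
for field in reflected {
    print("\(field.label): \(field.type) = \(field.value)")
}
```

Mirror can only read values, which is one reason a deserializer that has to write into properties needs the field offsets from the metadata.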











