HotStuff is a Byzantine fault-tolerant state machine replication protocol that incurs linear communication costs to achieve consensus. This linear scalability has led to the protocol's adoption as the consensus mechanism in permissioned blockchains. This paper discusses the architecture, testing, and evaluation of our extensible framework for implementing HotStuff and its variants. The framework already contains three HotStuff variants, along with interchangeable components for cryptographic operations and leader selection. Inspired by the Twins approach, we also provide a testing framework for validating protocol implementations by inducing Byzantine behaviors. Test generation is protocol-agnostic; new protocols can execute the test suite with little to no modification. We report insights on how we benefited from Twins for validation and test-driven development. Leveraging our deployment tool, we evaluated our implementation in various configurations.